100+ datasets found
  1. The metagenome sequencing data have been deposited in the European...

    • catalog.data.gov
    • datasets.ai
    Updated Mar 11, 2023
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). The metagenome sequencing data have been deposited in the European Nucleotide Archive (ENA). [Dataset]. https://catalog.data.gov/dataset/the-metagenome-sequencing-data-have-been-deposited-in-the-european-nucleotide-archive-ena-8ad17
    Dataset updated
    Mar 11, 2023
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The raw sequencing data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB40814 with the following BioSample numbers: SAMEA7465213 (sample DWDS A1), SAMEA7465214 (DWDS A2), SAMEA7465217 (DWDS B1), SAMEA7465218 (DWDS B2), SAMEA7465220 (DWDS C1), SAMEA7465221 (DWDS C2), SAMEA7465222 (DWDS D1), SAMEA7465223 (DWDS D2), SAMEA7465226 (DWDS E1), and SAMEA7465227 (DWDS E2). This dataset is associated with the following publication: Gomez-Alvarez, V., S. Siponen, A. Kauppinen, A. Hokajarvi, A. Tiwari, A. Sarekoski, I.T. Miettinen, E. Torvinen, and T. Pitkanen. A comparative analysis employing a gene- and genome-centric metagenomic approach reveals changes in composition, function, and activity in waterworks with different treatment processes and source water in Finland. WATER RESEARCH. Elsevier Science Ltd, New York, NY, USA, 229: 119495, (2023).
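
    The accessions listed in this description can be kept in a small lookup table and turned into a request for the run files of the study. The sketch below is illustrative only: it assumes the ENA Portal API "filereport" endpoint and the field names shown, both of which should be checked against the current ENA documentation. The function name filereport_url is hypothetical.

```python
from urllib.parse import urlencode

# BioSample accessions and sample labels, copied from the description above.
BIOSAMPLES = {
    "SAMEA7465213": "DWDS A1",
    "SAMEA7465214": "DWDS A2",
    "SAMEA7465217": "DWDS B1",
    "SAMEA7465218": "DWDS B2",
    "SAMEA7465220": "DWDS C1",
    "SAMEA7465221": "DWDS C2",
    "SAMEA7465222": "DWDS D1",
    "SAMEA7465223": "DWDS D2",
    "SAMEA7465226": "DWDS E1",
    "SAMEA7465227": "DWDS E2",
}

# Assumption: ENA Portal API file report endpoint (verify before relying on it).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "sample_accession", "fastq_ftp")):
    """Build a URL that asks for a tab-separated list of runs under an accession."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",         # assumed result type for raw reads
        "fields": ",".join(fields),   # assumed field names
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


if __name__ == "__main__":
    # Only prints the URL; no network request is made here.
    print(filereport_url("PRJEB40814"))
```

    The returned TSV, if the endpoint behaves as assumed, could then be joined against BIOSAMPLES on the sample accession to recover the DWDS labels.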

  2. European Nucleotide Archive

    • bioregistry.io
    Updated Feb 10, 2022
    Cite
    (2022). European Nucleotide Archive [Dataset]. http://identifiers.org/re3data:r3d100010527
    Dataset updated
    Feb 10, 2022
    Description

    The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. ENA is made up of a number of distinct databases that include EMBL-Bank, the Sequence Read Archive (SRA) and the Trace Archive, each with its own data formats and standards. This collection references EMBL-Bank identifiers.

  3. European Nucleotide Archive (ENA)

    • dknet.org
    • rrid.site
    • +1more
    Updated Jan 29, 2022
    Cite
    (2022). European Nucleotide Archive (ENA) [Dataset]. http://identifiers.org/RRID:SCR_006515
    Dataset updated
    Jan 29, 2022
    Description

    Public archive providing a comprehensive record of the world's nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation. All submitted data, once public, will be exchanged with the NCBI and DDBJ as part of the INSDC data exchange agreement. The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources including submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centers and routine and comprehensive exchange with their partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature. ENA is made up of a number of distinct databases that include the EMBL Nucleotide Sequence Database (EMBL-Bank), the newly established Sequence Read Archive (SRA) and the Trace Archive. The main tool for downloading ENA data is the ENA Browser, which is available through REST URLs for easy programmatic use. All ENA data are available through the ENA Browser. Note: EMBL Nucleotide Sequence Database (EMBL-Bank) is entirely included within this resource.
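
    To make the "REST URLs" mentioned above concrete, here is a minimal sketch of how a record URL might be composed. The base URL and the three record formats are assumptions about the ENA Browser API, not something stated in this listing, and browser_url is a hypothetical helper; consult the ENA documentation before use.

```python
from urllib.parse import quote

# Assumption: base URL of the ENA Browser REST API.
ENA_BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"

# Assumption: record formats exposed as path segments.
RECORD_FORMATS = ("xml", "fasta", "embl")


def browser_url(accession, record_format="xml"):
    """Compose a REST URL for one record in the requested format."""
    if record_format not in RECORD_FORMATS:
        raise ValueError(f"unsupported format: {record_format!r}")
    # Percent-encode the accession so it is safe as a single path segment.
    return f"{ENA_BROWSER_API}/{record_format}/{quote(accession, safe='')}"


if __name__ == "__main__":
    # Prints the URL only; fetching it is left to the caller.
    print(browser_url("PRJEB40814"))
```

    Keeping URL construction separate from the download step makes the pattern easy to check offline and to adapt if the endpoint layout differs from what is assumed here.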

  4. List of whole genome resequenced datasets available on Sequence Read...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Jan 19, 2023
    Cite
    Forcina, Giovanni; Sadanandan, Keren R.; Wu, Meng Yue; Low, Gabriel Weijie; Baldwin, Maude W.; Wu, Shaoyuan; Rheindt, Frank E.; van Grouw, Hein; Edwards, Scott V.; Gwee, Chyi Yin (2023). List of whole genome resequenced datasets available on Sequence Read Archive, European Nucleotide Archive or chickenSD for Gallus gallus and used in this study. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001056782
    Dataset updated
    Jan 19, 2023
    Authors
    Forcina, Giovanni; Sadanandan, Keren R.; Wu, Meng Yue; Low, Gabriel Weijie; Baldwin, Maude W.; Wu, Shaoyuan; Rheindt, Frank E.; van Grouw, Hein; Edwards, Scott V.; Gwee, Chyi Yin
    Description

    Local chickens that do not conform to a breed are labelled as “village”. (XLSX)

  5. Modal

    • data.mendeley.com
    Updated Feb 5, 2020
    Cite
    Radu Constantin Parpala (2020). Modal [Dataset]. http://doi.org/10.17632/93j24wpf55.1
    Dataset updated
    Feb 5, 2020
    Authors
    Radu Constantin Parpala
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Ansys archive file

  6. The OHEJP BeONE Project – Escherichia coli genome assembly dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, zip
    Updated Jul 24, 2023
    Cite
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Escherichia coli genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7802728
    Available download formats: zip, bin
    Dataset updated
    Jul 24, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset comprises the genome assemblies of 308 Escherichia coli samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7120057), comprising genome assemblies of 1,999 E. coli samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

    File “BeONE_Ec_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype, and information regarding year of sampling, country and source.

    The archive “BeONE_Ec_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

    Dataset selection and curation

    This anonymized dataset of E. coli genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57098. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 308 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with seq_typing v2.2.
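
    The inclusion rule described in this paragraph can be written as a short predicate. This is a simplified sketch, not the consortium's actual code: the Assembly record and its field names are hypothetical, and the manual inspection step mentioned in the text is not modelled. Only the "NumContamSNVs" check name and the >98% threshold come from the description.

```python
from dataclasses import dataclass

# Threshold from the description: strictly more than 98% of reads
# must correspond to the correct species.
SPECIES_READ_THRESHOLD = 98.0


@dataclass
class Assembly:
    # Hypothetical record layout, used for illustration only.
    isolate: str
    failed_checks: frozenset          # names of QC checks that failed
    pct_reads_correct_species: float  # percentage, 0 to 100


def include_in_dataset(assembly):
    """Return True if the assembly belongs in the final dataset."""
    if not assembly.failed_checks:
        return True  # passed QC outright
    # Recovery path: the only failed check is the contamination SNV count.
    only_contam = assembly.failed_checks == {"NumContamSNVs"}
    return only_contam and assembly.pct_reads_correct_species > SPECIES_READ_THRESHOLD


if __name__ == "__main__":
    candidates = [
        Assembly("iso1", frozenset(), 99.5),
        Assembly("iso2", frozenset({"NumContamSNVs"}), 98.7),
        Assembly("iso3", frozenset({"NumContamSNVs"}), 97.0),
    ]
    print([a.isolate for a in candidates if include_in_dataset(a)])
```

    Treating the failed checks as a set makes the "exclusively due to" condition an equality test rather than a membership test, so an assembly failing any additional check is not recovered.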

    Funding

    This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

    Acknowledgements

    We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.

  7. Australian Nucleotide (DNA/RNA) and Protein sequences from Australian...

    • researchdata.edu.au
    Updated Jun 4, 2014
    Cite
    QFAB Bioinformatics (2014). Australian Nucleotide (DNA/RNA) and Protein sequences from Australian organisms in the species Oedura gracilis [Dataset]. https://researchdata.edu.au/australian-nucleotide-dnarna-oedura-gracilis/442223
    Dataset updated
    Jun 4, 2014
    Dataset provided by
    QFAB
    Authors
    QFAB Bioinformatics
    Area covered
    Australia
    Description

    This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from Australian Oedura gracilis. Other information about this group:

    The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the International Research Community.

    The identification of species in Oedura gracilis as Australian dwelling organisms has been achieved by accessing the Australian Plant Census (APC) or Australian Faunal Directory (AFD) through the Atlas of Living Australia.

  8. of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...

    • springernature.figshare.com
    • datasetcatalog.nlm.nih.gov
    zip
    Updated May 31, 2023
    Cite
    Brad Friedman; Tom Maniatis (2023). of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035245.v1
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Brad Friedman; Tom Maniatis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data

  9. The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset

    • zenodo.org
    • openagrar.de
    • +2more
    bin, zip
    Updated Jul 24, 2023
    Cite
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7267487
    Available download formats: bin, zip
    Dataset updated
    Jul 24, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset comprises the genome assemblies of 1,426 Listeria monocytogenes samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7116878), comprising genome assemblies of 1,874 L. monocytogenes samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

    File “BeONE_Lm_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type.

    The archive “BeONE_Lm_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

    Dataset selection and curation

    This anonymized dataset of L. monocytogenes genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57166. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,426 isolates passed the dataset curation step and were included in the final dataset.

    Funding

    This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

  10. EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated May 31, 2023
    Cite
    Javier del Campo; Martin Kolisko; Vittorio Boscaro; Luciana F. Santoferrara; Serafim Nenarokov; Ramon Massana; Laure Guillou; Alastair Simpson; Cedric Berney; Colomban de Vargas; Matthew W. Brown; Patrick J. Keeling; Laura Wegener Parfrey (2023). EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution [Dataset]. http://doi.org/10.1371/journal.pbio.2005849
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Biology
    Authors
    Javier del Campo; Martin Kolisko; Vittorio Boscaro; Luciana F. Santoferrara; Serafim Nenarokov; Ramon Massana; Laure Guillou; Alastair Simpson; Cedric Berney; Colomban de Vargas; Matthew W. Brown; Patrick J. Keeling; Laura Wegener Parfrey
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Environmental sequencing has greatly expanded our knowledge of micro-eukaryotic diversity and ecology by revealing previously unknown lineages and their distribution. However, the value of these data is critically dependent on the quality of the reference databases used to assign an identity to environmental sequences. Existing databases contain errors and struggle to keep pace with rapidly changing eukaryotic taxonomy, the influx of novel diversity, and computational challenges related to assembling the high-quality alignments and trees needed for accurate characterization of lineage diversity. EukRef (eukref.org) is an ongoing community-driven initiative that addresses these challenges by bringing together taxonomists with expertise spanning the eukaryotic tree of life and microbial ecologists, who use environmental sequence data to develop reliable reference databases across the diversity of microbial eukaryotes. EukRef organizes and facilitates rigorous mining and annotation of sequence data by providing protocols, guidelines, and tools. The EukRef pipeline and tools allow users interested in a particular group of microbial eukaryotes to retrieve all sequences belonging to that group from International Nucleotide Sequence Database Collaboration (INSDC) (GenBank, the European Nucleotide Archive [ENA], or the DNA DataBank of Japan [DDBJ]), to place those sequences in a phylogenetic tree, and to curate taxonomic and environmental information for the group. We provide guidelines to facilitate the process and to standardize taxonomic annotations. The final outputs of this process are (1) a reference tree and alignment, (2) a reference sequence database, including taxonomic and environmental information, and (3) a list of putative chimeras and other artifactual sequences. These products will be useful for the broad community as they become publicly available (at eukref.org) and are shared with existing reference databases.

  11. SUPERSEDED - Human, yeast and pig genomics: sequence submissions and first...

    • find.data.gov.scot
    • dtechtive.com
    csv, txt
    Updated Jun 6, 2018
    Cite
    Science, Technology and Innovation Studies. University of Edinburgh (2018). SUPERSEDED - Human, yeast and pig genomics: sequence submissions and first sequence descriptions in the literature (1980-2015) [Dataset]. http://doi.org/10.7488/ds/2358
    Available download formats: csv (0.1938 MB), csv (11.83 MB), csv (0.8656 MB), csv (0.056 MB), csv (0.5938 MB), csv (1.918 MB), txt (0.0166 MB)
    Dataset updated
    Jun 6, 2018
    Dataset provided by
    Science, Technology and Innovation Studies. University of Edinburgh
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2589

    This data collection is derived from two sources: 1) Submissions of DNA sequences of S. cerevisiae (yeast), Sus scrofa (pig) and Homo sapiens (human) to the European Nucleotide Archive, and 2) First description of these sequences in the scientific literature. The time range of the records is 1980-2000 (yeast), 1985-2005 (human) and 1990-2015 (pig). In total, each species has two associated datasets: 1) A .csv file documenting the PubMed ID of each article describing new sequences, all paper authors, all institutional affiliations of each author, country of institution, year of first submission to the European Nucleotide Archive, and the year of article publication, and 2) A .csv file documenting all institutions submitting to the European Nucleotide Archive, number of nucleotides sequenced, number of submissions per institution, and year of submission to the database. The approximate number of records is 28,000 publications and over 2 million sequence submissions. Some data about submitting institutions is not fully cleaned.

  12. The OHEJP BeONE Project – Salmonella enterica genome assembly dataset

    • data.europa.eu
    • data.niaid.nih.gov
    • +1more
    unknown
    Updated Jul 3, 2025
    Cite
    Zenodo (2025). The OHEJP BeONE Project – Salmonella enterica genome assembly dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7802723?locale=pl
    Available download formats: unknown (189629)
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset comprises the genome assemblies of 1,540 Salmonella enterica samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7119735), comprising genome assemblies of 1,434 S. enterica samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

    File “BeONE_Se_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype, and information regarding year of sampling, country and source.

    The archive “BeONE_Se_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

    Dataset selection and curation

    This anonymized dataset of S. enterica genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57179. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,540 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with SeqSero2 v1.2.1 (Zhang et al. 2019).

    Funding

    This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

    Acknowledgements

    We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.

  13. Nucleotide (DNA / RNA) and Protein sequences from the Australian research...

    • researchdata.edu.au
    Updated Jul 20, 2012
    Cite
    QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian research institution University of Notre Dame [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-notre-dame/56616
    Dataset updated
    Jul 20, 2012
    Dataset provided by
    QFAB
    Authors
    QFAB Bioinformatics
    Description

    This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian research institution University of Notre Dame. The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the International Research Community.

  14. MOESM4 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...

    • springernature.figshare.com
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Jun 1, 2023
    Cite
    Brad Friedman; Tom Maniatis (2023). MOESM4 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035281.v1
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Brad Friedman; Tom Maniatis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 4: Archival copy of software. (ZIP 3 MB)

  15. Overview of the bioinformatic steps involved in submitting novel DNA...

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Michael Gruenstaeudl; Yannick Hartmaring (2023). Overview of the bioinformatic steps involved in submitting novel DNA sequence data to ENA when using EMBL2checklists, starting from assembled DNA sequences. [Dataset]. http://doi.org/10.1371/journal.pone.0210347.t001
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Michael Gruenstaeudl; Yannick Hartmaring
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview of the bioinformatic steps involved in submitting novel DNA sequence data to ENA when using EMBL2checklists, starting from assembled DNA sequences.

  16. BioSample Database at EBI

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). BioSample Database at EBI [Dataset]. http://identifiers.org/RRID:SCR_004856
    Dataset updated
    Jan 29, 2022
    Description

    Database that aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive or the PRoteomics IDEntifications database (PRIDE). It provides links to assays for specific samples, and accepts direct submissions of sample information. The goals of the BioSample Database include:

    • recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE;
    • minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases; and
    • supporting cross database queries by sample characteristics.

    The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples (at) ebi.ac.uk.

    Current data sources:

    • European Nucleotide Archive (424,811 samples)
    • PRIDE (17,001 samples)
    • ArrayExpress (1,187,884 samples)
    • ENCODE cell lines (119 samples)
    • CORIELL cell lines (27,002 samples)
    • Thousand Genome (2,628 samples)
    • HapMap (1,417 samples)
    • IMSR (248,660 samples)
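
    For a sense of scale, the per-source sample counts quoted in this description can be tallied in a few lines. The sketch below only restates the figures given above; the variable and function names are illustrative, and the counts reflect the listing as captured, not the current state of the database.

```python
# Sample counts per data source, copied from the description above.
SOURCES = {
    "European Nucleotide Archive": 424_811,
    "PRIDE": 17_001,
    "ArrayExpress": 1_187_884,
    "ENCODE cell lines": 119,
    "CORIELL cell lines": 27_002,
    "Thousand Genome": 2_628,
    "HapMap": 1_417,
    "IMSR": 248_660,
}

TOTAL_SAMPLES = sum(SOURCES.values())


def share(source):
    """Fraction of all listed samples contributed by one source."""
    return SOURCES[source] / TOTAL_SAMPLES


if __name__ == "__main__":
    # Print sources from largest to smallest with their percentage share.
    for name in sorted(SOURCES, key=SOURCES.get, reverse=True):
        print(f"{name}: {SOURCES[name]:,} ({share(name):.1%})")
    print(f"Total: {TOTAL_SAMPLES:,}")
```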

  17. The OHEJP BeONE Project – Campylobacter jejuni genome assembly dataset

    • data.europa.eu
    • data.niaid.nih.gov
    • +1 more
    unknown
    Updated May 16, 2024
    Cite
    Zenodo (2024). The OHEJP BeONE Project – Campylobacter jejuni genome assembly dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7802717?locale=fr
    Available download formats: unknown (74818)
    Dataset updated
    May 16, 2024
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset comprises the genome assemblies of 610 Campylobacter jejuni samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). A complementary dataset is also available (https://zenodo.org/record/7120166), comprising genome assemblies of 3,076 C. jejuni samples selected from the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). The file “BeONE_Cj_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type, as well as the year of sampling, country and source. The archive “BeONE_Cj_assemblies.zip” contains the genome assemblies (.fasta format) of every isolate listed in the metadata file.

    Dataset selection and curation

    This anonymized dataset of C. jejuni genome assemblies was generated from Next Generation Sequencing data collected within the BeONE Consortium and available at the European Nucleotide Archive under BioProject Accession Number PRJEB57119. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, and MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the remaining assemblies, a considerable proportion was flagged as “QC fail” exclusively because of the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated into the final dataset (those samples are labeled in the metadata file). In total, 610 isolates passed the dataset curation step and were included in the final dataset.

    Funding

    This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

    Acknowledgements

    We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.

  18. Nucleotide (DNA / RNA) and Protein sequences from the Australian research...

    • researchdata.edu.au
    Updated Jul 20, 2012
    Cite
    QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian research institution Western Health [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-western-health/53125
    Dataset updated
    Jul 20, 2012
    Dataset provided by
    QFAB
    Authors
    QFAB Bioinformatics
    Area covered
    Australia
    Description

    This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian research institution Western Health. The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the International Research Community.

  19. MOESM3 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...

    • springernature.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Brad Friedman; Tom Maniatis (2023). MOESM3 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035278.v1
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Brad Friedman; Tom Maniatis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 3: Data for Figure 2d. (ZIP 6 MB)

  20. Nucleotide (DNA / RNA) and Protein sequences from the Australian dwelling...

    • researchdata.edu.au
    Updated Jul 20, 2012
    Cite
    QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian dwelling species Pityrodia uncinata [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-pityrodia-uncinata/48964
    Dataset updated
    Jul 20, 2012
    Dataset provided by
    QFAB
    Authors
    QFAB Bioinformatics
    Description

    This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian dwelling organism Pityrodia uncinata.

    The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the International Research Community.

    The identification of the species Pityrodia uncinata as an Australian dwelling organism has been achieved by accessing the Australian Plant Census (APC) or Australian Faunal Directory (AFD) through the Atlas of Living Australia.
