100+ datasets found
  1. European Nucleotide Archive

    • bioregistry.io
    Updated Apr 28, 2021
    Cite
    (2021). European Nucleotide Archive [Dataset]. http://identifiers.org/re3data:r3d100010527
    Explore at:
    Dataset updated
    Apr 28, 2021
    Description

    The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. ENA is made up of a number of distinct databases, including EMBL-Bank, the Sequence Read Archive (SRA) and the Trace Archive, each with its own data formats and standards. This collection references EMBL-Bank identifiers.

  2. Data from: Assembly information services in the European Nucleotide Archive....

    • omicsdi.org
    xml
    Cite
    Pakseresht N, Assembly information services in the European Nucleotide Archive. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC3965037
    Explore at:
    Available download formats: xml
    Authors
    Pakseresht N
    Variables measured
    Unknown
    Description

    The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.

  3. The metagenome sequencing data have been deposited in the European...

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Jun 18, 2022
    Cite
    U.S. EPA Office of Research and Development (ORD) (2022). The metagenome sequencing data have been deposited in the European Nucleotide Archive (ENA). [Dataset]. https://catalog.data.gov/dataset/the-metagenome-sequencing-data-have-been-deposited-in-the-european-nucleotide-archive-ena
    Explore at:
    Dataset updated
    Jun 18, 2022
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The raw sequencing data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB40814 with the following BioSample numbers: SAMEA7465213 (sample DWDS A1), SAMEA7465214 (DWDS A2), SAMEA7465217 (DWDS B1), SAMEA7465218 (DWDS B2), SAMEA7465220 (DWDS C1), SAMEA7465221 (DWDS C2), SAMEA7465222 (DWDS D1), SAMEA7465223 (DWDS D2), SAMEA7465226 (DWDS E1), and SAMEA7465227 (DWDS E2). This dataset is associated with the following publication: Tiwari, A., V. Gomez-Alvarez, S. Siponen, A. Sarekoski, A. Hokajärvi, A. Kauppinen, E. Torvinen, I.T. Miettinen, and T. Pitkänen. Bacterial Genes Encoding Resistance against Antibiotics and Metals in Well-Maintained Drinking Water Distribution Systems in Finland. Frontiers in Microbiology. Frontiers, Lausanne, SWITZERLAND, 12: 803094, (2022).

  4. European Nucleotide Archive (ENA)

    • covid19dataportal.org
    Updated Jan 1, 2021
    Cite
    INSDC (2021). European Nucleotide Archive (ENA) [Dataset]. https://www.covid19dataportal.org/sequences?db=sequences&query=ACE2&requestFrom=searchExample
    Explore at:
    Dataset updated
    Jan 1, 2021
    Dataset provided by
    International Nucleotide Sequence Database Collaboration
    Authors
    INSDC
    License

    https://www.ebi.ac.uk/ena/browser/about/policies

    Description

    The European Nucleotide Archive (ENA) provides a comprehensive record of the world’s nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation.

  5. The European Nucleotide Archive.

    • omicsdi.org
    xml
    Updated Jan 15, 2011
    Cite
    Leinonen R (2011). The European Nucleotide Archive. [Dataset]. https://www.omicsdi.org/dataset/biostudies-literature/S-EPMC3013801
    Explore at:
    Available download formats: xml
    Dataset updated
    Jan 15, 2011
    Authors
    Leinonen R
    Variables measured
    Unknown
    Description

    The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary nucleotide-sequence repository. The ENA consists of three main databases: the Sequence Read Archive (SRA), the Trace Archive and EMBL-Bank. The objective of ENA is to support and promote the use of nucleotide sequencing as an experimental research platform by providing data submission, archive, search and download services. In this article, we outline these services and describe major changes and improvements introduced during 2010. These include extended EMBL-Bank and SRA-data submission services, extended ENA Browser functionality, support for submitting data to the European Genome-phenome Archive (EGA) through SRA, and the launch of a new sequence similarity search service.

  6. European Nucleotide Archive (ENA)

    • scicrunch.org
    Updated Mar 16, 2010
    Cite
    (2010). European Nucleotide Archive (ENA) [Dataset]. http://identifiers.org/RRID:SCR_006515
    Explore at:
    Dataset updated
    Mar 16, 2010
    Description

    Public archive providing a comprehensive record of the world's nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation. All submitted data, once public, will be exchanged with the NCBI and DDBJ as part of the INSDC data exchange agreement. The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources including submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centers and routine and comprehensive exchange with its partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature. ENA is made up of a number of distinct databases that include the EMBL Nucleotide Sequence Database (EMBL-Bank), the newly established Sequence Read Archive (SRA) and the Trace Archive. The main tool for downloading ENA data is the ENA Browser, which is available through REST URLs for easy programmatic use. All ENA data are available through the ENA Browser. Note: EMBL Nucleotide Sequence Database (EMBL-Bank) is entirely included within this resource.
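As a rough illustration of addressing ENA records by URL, the sketch below builds a browser link from an accession. It assumes the `https://www.ebi.ac.uk/ena/browser/view/<accession>` pattern that appears in links elsewhere in this listing; that is a record view page, not necessarily the REST download endpoint this description refers to. The function name `ena_view_url` is hypothetical.

```python
from urllib.parse import quote

# Assumption: URL prefix copied from the browser links quoted in this listing.
ENA_BROWSER_VIEW = "https://www.ebi.ac.uk/ena/browser/view/"


def ena_view_url(accession: str) -> str:
    """Return the ENA Browser view URL for an accession (hypothetical helper)."""
    accession = accession.strip()
    if not accession:
        raise ValueError("accession must not be empty")
    # Percent-encode the accession so unexpected characters cannot break the URL.
    return ENA_BROWSER_VIEW + quote(accession, safe="")


if __name__ == "__main__":
    print(ena_view_url("PRJEB40814"))
```

No network request is made here; the helper only assembles the address, so fetching and error handling are left to the caller.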

  7. FAIRsharing record for: European Nucleotide Archive

    • fairsharing.org
    Updated Nov 4, 2014
    Cite
    (2014). FAIRsharing record for: European Nucleotide Archive [Dataset]. http://doi.org/10.25504/FAIRsharing.dj8nt8
    Explore at:
    Dataset updated
    Nov 4, 2014
    Description

    This FAIRsharing record describes: The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and experimental design. Serving both as the database of record for the output of the world's sequencing activity and as a platform for the management, sharing and publication of sequence data, the ENA provides a portfolio of services for submission, data management, search and retrieval across web and programmatic interfaces. The ENA is part of the International Nucleotide Sequence Database Collaboration (INSDC), which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at the NCBI. These three organizations exchange data on a daily basis.

  8. European Nucleotide Archive accession numbers for all sequences generated...

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Gerrit Hartig; Ralph S. Peters; Janus Borner; Claudia Etzbauer; Bernhard Misof; Oliver Niehuis (2023). European Nucleotide Archive accession numbers for all sequences generated from primer testing. [Dataset]. http://doi.org/10.1371/journal.pone.0039826.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Gerrit Hartig; Ralph S. Peters; Janus Borner; Claudia Etzbauer; Bernhard Misof; Oliver Niehuis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    European Nucleotide Archive accession numbers for all sequences generated from primer testing.

  9. The metagenome sequencing data have been deposited in the European...

    • catalog.data.gov
    Updated Mar 11, 2023
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). The metagenome sequencing data have been deposited in the European Nucleotide Archive (ENA). [Dataset]. https://catalog.data.gov/dataset/the-metagenome-sequencing-data-have-been-deposited-in-the-european-nucleotide-archive-ena-8ad17
    Explore at:
    Dataset updated
    Mar 11, 2023
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The raw sequencing data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB40814 with the following BioSample numbers: SAMEA7465213 (sample DWDS A1), SAMEA7465214 (DWDS A2), SAMEA7465217 (DWDS B1), SAMEA7465218 (DWDS B2), SAMEA7465220 (DWDS C1), SAMEA7465221 (DWDS C2), SAMEA7465222 (DWDS D1), SAMEA7465223 (DWDS D2), SAMEA7465226 (DWDS E1), and SAMEA7465227 (DWDS E2). This dataset is associated with the following publication: Gomez-Alvarez, V., S. Siponen, A. Kauppinen, A. Hokajarvi , A. Tiwari, A. Sarekoski, I.T. Miettinen, E. Torvinen, and T. Pitkanen. A comparative analysis employing a gene- and genome-centric metagenomic approach reveals changes in composition, function, and activity in waterworks with different treatment processes and source water in Finland. WATER RESEARCH. Elsevier Science Ltd, New York, NY, USA, 229: 119495, (2023).
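A sample-to-accession list like the one above is easy to mistype, so a small consistency check can help. The sketch below is only an illustration: the labels and accessions are copied from this description, while the `SAMEA` plus digits pattern and the helper name `find_problems` are assumptions based on the shape of those accessions.

```python
import re

# Labels and BioSample accessions as listed in the description above.
SAMPLES = {
    "DWDS A1": "SAMEA7465213",
    "DWDS A2": "SAMEA7465214",
    "DWDS B1": "SAMEA7465217",
    "DWDS B2": "SAMEA7465218",
    "DWDS C1": "SAMEA7465220",
    "DWDS C2": "SAMEA7465221",
    "DWDS D1": "SAMEA7465222",
    "DWDS D2": "SAMEA7465223",
    "DWDS E1": "SAMEA7465226",
    "DWDS E2": "SAMEA7465227",
}

# Assumption: every accession in this list is "SAMEA" followed by digits.
SAMPLE_PATTERN = re.compile(r"^SAMEA\d+$")


def find_problems(samples):
    """Report malformed accessions and accessions assigned to two labels."""
    problems = []
    seen = {}
    for label, accession in samples.items():
        if not SAMPLE_PATTERN.match(accession):
            problems.append(f"{label}: unexpected format {accession}")
        if accession in seen:
            problems.append(f"{label}: {accession} already used by {seen[accession]}")
        else:
            seen[accession] = label
    return problems


if __name__ == "__main__":
    print(find_problems(SAMPLES) or "no problems found")
```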

  10. Quantitative monitoring of nucleotide information from genetic resources in...

    • doi.ipk-gatersleben.de
    Updated Apr 16, 2021
    Cite
    Guy Cochrane; Blaise Alako; Matthias Lange; Mehmood Ghaffar; Jens Freitag; Amber Scholz; Upneet Hillebrand (2021). Quantitative monitoring of nucleotide information from genetic resources in context of their citation in the scientific literature [Dataset]. https://doi.ipk-gatersleben.de/DOI/8d5e2634-88ac-4f0f-9859-cb2006091775/a6ca2009-3cce-4dc4-bdf0-c7cb54f5156f/2
    Explore at:
    Dataset updated
    Apr 16, 2021
    Dataset provided by
    e!DAL - Plant Genomics and Phenomics Research Data Repository (PGP), IPK Gatersleben, Seeland OT Gatersleben, Corrensstraße 3, 06466, Germany
    Authors
    Guy Cochrane; Blaise Alako; Matthias Lange; Mehmood Ghaffar; Jens Freitag; Amber Scholz; Upneet Hillebrand
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set comprises records extracted from the European Nucleotide Archive and linked to citations in open-access publications aggregated at Europe PubMed Central. To build it, ENA records were parsed, filtered for a valid country tag, and fed into the ePMC RESTful API to extract matching secondary publications by ENA accession or project accession number. The resulting data sets are normalized as the tables ENA_SEQUENCES and PMC_REFERENCES, alongside a curated list of the world's countries in table CONTRIES and economic groups in table COUNTRY2GRP. These tables are the basis for a data warehouse and a web application that make it possible to join literature and sequence databases in a multidimensional fashion. A concrete use case in the context of the United Nations Convention on Biological Diversity is the analysis of countries with respect to nucleotide sequence use and contribution.

  11. Data from: EMBLmyGFF3: a converter facilitating genome annotation submission...

    • omicsdi.org
    xml
    Updated Aug 16, 2018
    Cite
    Norling M (2018). EMBLmyGFF3: a converter facilitating genome annotation submission to European Nucleotide Archive. [Dataset]. https://www.omicsdi.org/dataset/biostudies-literature/S-EPMC6090716
    Explore at:
    Available download formats: xml
    Dataset updated
    Aug 16, 2018
    Authors
    Norling M
    Variables measured
    Unknown
    Description

    OBJECTIVE: State-of-the-art genome annotation tools output GFF3 format files, but this format is not accepted as a submission format by the International Nucleotide Sequence Database Collaboration (INSDC) databases. Converting the GFF3 format to a format accepted by one of the three INSDC databases is a key step in the achievement of genome annotation projects. However, the flexibility existing in the GFF3 format makes this conversion task difficult to perform. Until now, no converter has been able to handle any GFF3 flavour regardless of source. RESULTS: Here we present EMBLmyGFF3, a robust universal converter from GFF3 format to EMBL format compatible with genome annotation submission to the European Nucleotide Archive. The tool uses json parameter files, which can be easily tuned by the user, allowing the mapping of corresponding vocabulary between the GFF3 format and the EMBL format. We demonstrate the conversion of GFF3 annotation files from four different commonly used annotation tools: Maker, Prokka, Augustus and Eugene. EMBLmyGFF3 is freely available at https://github.com/NBISweden/EMBLmyGFF3 .
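The idea of a user-tunable JSON vocabulary mapping can be illustrated with a toy example. This is a sketch only: the JSON layout and the function `translate_feature_types` are hypothetical and do not reproduce EMBLmyGFF3's actual parameter files or API. The target names are standard INSDC feature keys.

```python
import json

# Hypothetical mapping file content: GFF3 feature type -> EMBL feature key.
MAPPING_JSON = """
{
  "gene": "gene",
  "mRNA": "mRNA",
  "CDS": "CDS",
  "five_prime_UTR": "5'UTR",
  "three_prime_UTR": "3'UTR"
}
"""


def translate_feature_types(gff3_types, mapping):
    """Translate GFF3 feature types; collect the ones the mapping does not cover."""
    translated, unmapped = [], []
    for feature_type in gff3_types:
        if feature_type in mapping:
            translated.append(mapping[feature_type])
        else:
            unmapped.append(feature_type)
    return translated, unmapped


if __name__ == "__main__":
    mapping = json.loads(MAPPING_JSON)
    done, skipped = translate_feature_types(
        ["gene", "mRNA", "five_prime_UTR", "match"], mapping
    )
    print("translated:", done)
    print("unmapped:", skipped)
```

Keeping the vocabulary in data rather than in code means a user can adapt it to a different GFF3 flavour by editing the JSON alone.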

  12. Modal

    • data.mendeley.com
    Updated Feb 5, 2020
    Cite
    Radu Constantin Parpala (2020). Modal [Dataset]. http://doi.org/10.17632/93j24wpf55.1
    Explore at:
    Dataset updated
    Feb 5, 2020
    Authors
    Radu Constantin Parpala
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Ansys archive file

  13. Data from: BioProject.

    • datadiscoverystudio.org
    Updated Jul 14, 2017
    Cite
    (2017). BioProject. [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/59ce5dee4011447b946870a0c1d89424/html
    Explore at:
    Dataset updated
    Jul 14, 2017
    Description

    The BioProject database provides an organizational framework to access information about research projects with links to data that have been or will be deposited into archival databases maintained at members of the International Nucleotide Sequence Database Consortium (INSDC, which comprises the DNA DataBank of Japan (DDBJ), the European Nucleotide Archive at European Molecular Biology Laboratory (ENA), and GenBank at the National Center for Biotechnology Information (NCBI)).

  14. of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...

    • springernature.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Brad Friedman; Tom Maniatis (2023). of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035245.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Brad Friedman; Tom Maniatis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data

  15. Supporting data for "Quantitative monitoring of nucleotide sequence data...

    • gigadb.org
    Updated Nov 11, 2021
    Cite
    (2021). Supporting data for "Quantitative monitoring of nucleotide sequence data from genetic resources in context of their citation in the scientific literature" [Dataset]. http://doi.org/10.5524/100947
    Explore at:
    Dataset updated
    Nov 11, 2021
    Description

    Linking nucleotide sequence data (NSD) to scientific publication citations can enhance understanding of NSD provenance, scientific use, and reuse in the community. By connecting publications with NSD records, NSD geographical provenance information, and author geographical information, it becomes possible to assess the contribution of NSD to infer trends in scientific knowledge gain at the global level.
    We extracted and linked records from the European Nucleotide Archive to citations in open-access publications aggregated at Europe PubMed Central for this data note. A total of 8,464,292 ENA accessions with geographical provenance information were associated with publications. We conducted a data quality review to uncover potential issues in publication citation information extraction and author affiliation tagging and developed and implemented best-practice recommendations for citation extraction. We constructed flat data tables and a data warehouse with an interactive web application to enable ad hoc exploration of NSD use and summary statistics.
    The extraction and linking of NSD with associated publication citations enables transparency. The quality review contributes to enhanced text mining methods for identifier extraction and use. Furthermore, the global provision and use of NSD enable scientists worldwide to join literature and sequence databases in a multidimensional fashion. As a concrete use case, we visualized statistics of country clusters concerning NSD access in the context of discussions around digital sequence information under the United Nations Convention on Biological Diversity.

  16. The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset

    • zenodo.org
    • openagrar.de
    bin, zip
    Updated Jul 24, 2023
    Cite
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7267487
    Explore at:
    Available download formats: bin, zip
    Dataset updated
    Jul 24, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset comprises the genome assemblies of 1,426 Listeria monocytogenes samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7116878), comprising genome assemblies of 1,874 L. monocytogenes samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

    File “BeONE_Lm_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type.

    The archive “BeONE_Lm_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

    Dataset selection and curation

    This anonymized dataset of L. monocytogenes genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57166. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,426 isolates passed the dataset curation step and were included in the final dataset.
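The inclusion rule described above can be written down as a small decision function. This is a minimal sketch under stated assumptions: the class `AssemblyQC`, its field names and the function `include_in_dataset` are hypothetical, and the manual inspection step the authors mention is not modelled. Only the "NumContamSNVs" flag name and the >98% threshold come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class AssemblyQC:
    """Hypothetical per-assembly QC summary (not the Aquamis output format)."""
    sample_id: str
    failed_checks: set = field(default_factory=set)
    pct_reads_correct_species: float = 0.0


def include_in_dataset(qc: AssemblyQC, threshold: float = 98.0) -> bool:
    """Apply the curation rule: keep QC passes, recover a narrow class of fails."""
    if not qc.failed_checks:
        return True  # passed every QC check
    # Recover assemblies that failed exclusively on the contamination SNV check
    # and whose reads map to the correct species above the threshold.
    only_contam_flag = qc.failed_checks == {"NumContamSNVs"}
    return only_contam_flag and qc.pct_reads_correct_species > threshold


if __name__ == "__main__":
    examples = [
        AssemblyQC("s1"),
        AssemblyQC("s2", {"NumContamSNVs"}, 99.2),
        AssemblyQC("s3", {"NumContamSNVs"}, 97.5),
        AssemblyQC("s4", {"NumContamSNVs", "N50"}, 99.9),
    ]
    for qc in examples:
        print(qc.sample_id, include_in_dataset(qc))
```

The comparison is strict, matching the ">98%" wording, so an assembly at exactly 98% is not recovered.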

    Funding

    This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

  17. FAIRsharing record for: Short Read Archive eXtensible Markup Language

    • fairsharing.org
    Updated Nov 4, 2014
    Cite
    (2014). FAIRsharing record for: Short Read Archive eXtensible Markup Language [Dataset]. http://doi.org/10.25504/FAIRsharing.q72e3w
    Explore at:
    Dataset updated
    Nov 4, 2014
    Description

    This FAIRsharing record describes: The SRA data model contains the following objects:

    • Study: information about the sequencing project
    • Sample: information about the sequenced samples
    • Experiment: information about the libraries and platform; associated with study, sample(s) and run(s)
    • Run: contains the raw data files
    • Analysis: contains the analysis data files; associated with study, sample and run objects
    • Submission: information about the submission actions, including release date

    It is used by the Sequence Read Archive (SRA) and the European Nucleotide Archive (ENA) to store raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, Complete Genomics, and Pacific Biosciences SMRT. Please note that the SRA-XML webpage states that "This page has been deprecated and may be removed without further notice."
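The object relationships in this record can be sketched as plain Python dataclasses. This is an illustration only: the class names follow the objects named in the description, but every field name and the placeholder accessions are assumptions, not the SRA XML schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Study:
    accession: str
    title: str = ""  # information about the sequencing project


@dataclass
class Sample:
    accession: str
    description: str = ""  # information about the sequenced sample


@dataclass
class Run:
    accession: str
    data_files: List[str] = field(default_factory=list)  # raw data files


@dataclass
class Experiment:
    accession: str
    study: Study
    platform: str = ""  # library and platform information
    samples: List[Sample] = field(default_factory=list)
    runs: List[Run] = field(default_factory=list)


@dataclass
class Analysis:
    accession: str
    study: Study
    sample: Optional[Sample] = None
    run: Optional[Run] = None
    data_files: List[str] = field(default_factory=list)  # analysis data files


@dataclass
class Submission:
    accession: str
    release_date: Optional[date] = None  # when the submitted objects go public


if __name__ == "__main__":
    # Placeholder identifiers, not real accessions.
    study = Study("STUDY-1", "Example project")
    sample = Sample("SAMPLE-1")
    run = Run("RUN-1", ["reads_1.fastq.gz"])
    experiment = Experiment("EXP-1", study, "ILLUMINA", [sample], [run])
    print(experiment.study.accession, len(experiment.runs))
```

Holding object references instead of accession strings keeps the associations explicit; a real submission format would more likely link objects by accession.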

  18. Genome-wide haplotype counting in diatoms S. robusta and P. tricornutum -...

    • zenodo.org
    bin
    Updated Aug 29, 2020
    Cite
    Petra Bulankova (2020). Genome-wide haplotype counting in diatoms S. robusta and P. tricornutum - processed datasets [Dataset]. http://doi.org/10.5281/zenodo.4005721
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 29, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Petra Bulankova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets used for genome-wide haplotype counting in S. robusta and P. tricornutum next-generation sequencing data.

    For genome-wide haplotype counting in S. robusta and P. tricornutum, a reliable set of SNPs was first identified in the ILLUMINA short-read sequencing datasets and then used to count the number of haplotypes in the PacBio RS II and MinION long reads. The ILLUMINA and PacBio data of S. robusta were downloaded from https://www.ebi.ac.uk/ena/browser/view/PRJEB36614 and the ILLUMINA and MinION data of P. tricornutum were downloaded from https://www.ebi.ac.uk/ena/browser/view/PRJNA487263.

    Reference genome assembly for S. robusta: CAICTM010000001-CAICTM010004752 (European Nucleotide Archive) and

    Reference genome assembly for P. tricornutum: GCA_000150955.2 (European Nucleotide Archive).

    Uploaded files:

    -.bam files containing S. robusta PacBio self-corrected CCS reads aligned to reference and processed and P. tricornutum self-corrected MinION reads aligned to reference and processed:

    S_robusta_aligned_corrected_PacBio_reads.bam

    P_tricornutum_aligned_corrected_MinION_reads.bam

    -.table files with selected reliable SNPs used for haplotype counting with CHROM, POSITION, REFERENCE and ALTERNATIVE allele

    S_robusta_SNPs.table

    P_tricornutum_SNPs.table

  19. The OHEJP BeONE Project – Escherichia coli genome assembly dataset

    • zenodo.org
    bin, zip
    Updated Jul 24, 2023
    Cite
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Escherichia coli genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7267845
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Jul 24, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset comprises the genome assemblies of 308 Escherichia coli samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7120057), comprising genome assemblies of 1,999 E. coli samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

    File “BeONE_Ec_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype.

    The archive “BeONE_Ec_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

    Dataset selection and curation

    This anonymized dataset of E. coli genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57098. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 308 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with seq_typing v2.2.

    Funding

    This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

  20. Nucleotide (DNA / RNA) and Protein sequences from the Australian research...

    • researchdata.edu.au
    Updated Jul 20, 2012
    Cite
    QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian research institution CSIRO Entomology [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-csiro-entomology/79031
    Explore at:
    Dataset updated
    Jul 20, 2012
    Dataset provided by
    QFAB
    Authors
    QFAB Bioinformatics
    Description

    This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian research institution CSIRO Entomology. The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and the Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the international research community.
