100+ datasets found

The metagenome sequencing data have been deposited in the European...
catalog.data.gov
datasets.ai
Updated Mar 11, 2023
Cite
U.S. EPA Office of Research and Development (ORD) (2023). The metagenome sequencing data have been deposited in the European Nucleotide Archive (ENA). [Dataset]. https://catalog.data.gov/dataset/the-metagenome-sequencing-data-have-been-deposited-in-the-european-nucleotide-archive-ena-8ad17
Explore at:
Dataset updated
Mar 11, 2023
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
The raw sequencing data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB40814 with the following BioSample numbers: SAMEA7465213 (sample DWDS A1), SAMEA7465214 (DWDS A2), SAMEA7465217 (DWDS B1), SAMEA7465218 (DWDS B2), SAMEA7465220 (DWDS C1), SAMEA7465221 (DWDS C2), SAMEA7465222 (DWDS D1), SAMEA7465223 (DWDS D2), SAMEA7465226 (DWDS E1), and SAMEA7465227 (DWDS E2). This dataset is associated with the following publication: Gomez-Alvarez, V., S. Siponen, A. Kauppinen, A. Hokajarvi, A. Tiwari, A. Sarekoski, I.T. Miettinen, E. Torvinen, and T. Pitkanen. A comparative analysis employing a gene- and genome-centric metagenomic approach reveals changes in composition, function, and activity in waterworks with different treatment processes and source water in Finland. WATER RESEARCH. Elsevier Science Ltd, New York, NY, USA, 229: 119495, (2023).
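The accession list in this description is easier to reuse once it is held in a lookup table. The following is a minimal Python sketch, assuming the list is available as a plain string; the names `DESCRIPTION`, `PAIR` and `parse_biosamples` are illustrative and are not part of any ENA tooling.

```python
import re

# Accession list copied from the dataset description.
DESCRIPTION = (
    "SAMEA7465213 (sample DWDS A1), SAMEA7465214 (DWDS A2), "
    "SAMEA7465217 (DWDS B1), SAMEA7465218 (DWDS B2), "
    "SAMEA7465220 (DWDS C1), SAMEA7465221 (DWDS C2), "
    "SAMEA7465222 (DWDS D1), SAMEA7465223 (DWDS D2), "
    "SAMEA7465226 (DWDS E1), and SAMEA7465227 (DWDS E2)"
)

# One BioSample accession followed by its label in parentheses;
# the word "sample" appears only before the first label.
PAIR = re.compile(r"(SAMEA\d+) \((?:sample )?(DWDS [A-Z]\d)\)")


def parse_biosamples(text):
    """Map each BioSample accession to its sample label (hypothetical helper)."""
    return dict(PAIR.findall(text))


samples = parse_biosamples(DESCRIPTION)
for accession, label in samples.items():
    print(f"{accession}\t{label}")
```

The regular expression only covers the label pattern seen in this one description; other studies would likely need a different pattern.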
European Nucleotide Archive
bioregistry.io
Updated Feb 10, 2022
Cite
(2022). European Nucleotide Archive [Dataset]. http://identifiers.org/re3data:r3d100010527
Explore at:
Unique identifier
https://identifiers.org/re3data:r3d100010527
Dataset updated
Feb 10, 2022
Description
The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. ENA is made up of a number of distinct databases that include EMBL-Bank, the Sequence Read Archive (SRA) and the Trace Archive, each with its own data formats and standards. This collection references EMBL-Bank identifiers.
European Nucleotide Archive (ENA)
dknet.org
rrid.site
Updated Jan 29, 2022
Cite
(2022). European Nucleotide Archive (ENA) [Dataset]. http://identifiers.org/RRID:SCR_006515
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006515
Dataset updated
Jan 29, 2022
Description
Public archive providing a comprehensive record of the world's nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation. All submitted data, once public, will be exchanged with the NCBI and DDBJ as part of the INSDC data exchange agreement. The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources including submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centers and routine and comprehensive exchange with their partners in the International Nucleotide Sequence Database Collaboration (INSDC). Provision of nucleotide sequence data to ENA or its INSDC partners has become a central and mandatory step in the dissemination of research findings to the scientific community. ENA works with publishers of scientific literature and funding bodies to ensure compliance with these principles and to provide optimal submission systems and data access tools that work seamlessly with the published literature. ENA is made up of a number of distinct databases that include the EMBL Nucleotide Sequence Database (EMBL-Bank), the newly established Sequence Read Archive (SRA) and the Trace Archive. The main tool for downloading ENA data is the ENA Browser, which is available through REST URLs for easy programmatic use.
All ENA data are available through the ENA Browser. Note: EMBL Nucleotide Sequence Database (EMBL-Bank) is entirely included within this resource.
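The REST access mentioned in this description can be illustrated by building a record URL from an accession. The following Python sketch only constructs the URL and does not contact the service. The base URL and the `xml`, `fasta` and `embl` path segments are assumptions drawn from the public ENA Browser API and should be checked against the current ENA documentation; `ena_record_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Assumed base URL of the ENA Browser API; verify before relying on it.
ENA_BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"

# Assumed record formats exposed as path segments.
SUPPORTED_FORMATS = {"xml", "fasta", "embl"}


def ena_record_url(accession, fmt="xml"):
    """Build a REST URL for one ENA record (hypothetical helper)."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Percent-encode the accession so it is safe as a single path segment.
    return f"{ENA_BROWSER_API}/{fmt}/{quote(accession, safe='')}"


# PRJEB40814 is a study accession that appears elsewhere in this listing.
print(ena_record_url("PRJEB40814"))
```

Keeping URL construction separate from the download step makes the pattern easy to test offline and easy to correct if the endpoint layout differs from what is assumed here.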
List of whole genome resequenced datasets available on Sequence Read...
datasetcatalog.nlm.nih.gov
figshare.com
Updated Jan 19, 2023
Cite
Forcina, Giovanni; Sadanandan, Keren R.; Wu, Meng Yue; Low, Gabriel Weijie; Baldwin, Maude W.; Wu, Shaoyuan; Rheindt, Frank E.; van Grouw, Hein; Edwards, Scott V.; Gwee, Chyi Yin (2023). List of whole genome resequenced datasets available on Sequence Read Archive, European Nucleotide Archive or chickenSD for Gallus gallus and used in this study. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001056782
Explore at:
Dataset updated
Jan 19, 2023
Authors
Forcina, Giovanni; Sadanandan, Keren R.; Wu, Meng Yue; Low, Gabriel Weijie; Baldwin, Maude W.; Wu, Shaoyuan; Rheindt, Frank E.; van Grouw, Hein; Edwards, Scott V.; Gwee, Chyi Yin
Description
Local chickens that do not conform to a breed are labelled as “village”. (XLSX)
Modal
data.mendeley.com
Updated Feb 5, 2020
Cite
Radu Constantin Parpala (2020). Modal [Dataset]. http://doi.org/10.17632/93j24wpf55.1
Explore at:
Unique identifier
https://doi.org/10.17632/93j24wpf55.1
Dataset updated
Feb 5, 2020
Authors
Radu Constantin Parpala
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ansys archive file
The OHEJP BeONE Project – Escherichia coli genome assembly dataset
zenodo.org
data.niaid.nih.gov
bin, zip
Updated Jul 24, 2023
Cite
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Escherichia coli genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7802728
Explore at:
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.7802728
Dataset updated
Jul 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 308 Escherichia coli samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7120057), comprising genome assemblies of 1,999 E. coli samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Ec_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype, and information regarding year of sampling, country and source.

The archive “BeONE_Ec_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of E. coli genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57098. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 308 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with seq_typing v2.2.
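The curation rule described in this paragraph can be written as a small filter. The following is a minimal Python sketch under stated assumptions: the `Assembly` record, its field names and the `curate` function are hypothetical and do not reflect the actual Aquamis output format, and the manual inspection step mentioned in the text is not modeled.

```python
from dataclasses import dataclass

# Threshold from the description: more than 98% of reads from the correct species.
RECOVERY_THRESHOLD = 98.0


@dataclass(frozen=True)
class Assembly:
    isolate: str
    failed_checks: frozenset  # names of failed QC checks; empty means QC pass
    pct_reads_correct_species: float


def curate(assemblies):
    """Return (passed, recovered) isolate names following the described rule."""
    passed, recovered = [], []
    for a in assemblies:
        if not a.failed_checks:
            # Passed every QC check: included directly.
            passed.append(a.isolate)
        elif (a.failed_checks == {"NumContamSNVs"}
              and a.pct_reads_correct_species > RECOVERY_THRESHOLD):
            # Failed only on NumContamSNVs and is clearly the right species.
            recovered.append(a.isolate)
    return passed, recovered


example = [
    Assembly("iso1", frozenset(), 99.5),
    Assembly("iso2", frozenset({"NumContamSNVs"}), 98.7),
    Assembly("iso3", frozenset({"NumContamSNVs"}), 97.0),
    Assembly("iso4", frozenset({"NumContamSNVs", "N50"}), 99.9),
]
print(curate(example))
```

Returning the recovered isolates as a separate list mirrors the statement that those samples are labeled in the metadata file.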

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

Acknowledgements

We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.
Australian Nucleotide (DNA/RNA) and Protein sequences from Australian...
researchdata.edu.au
Updated Jun 4, 2014
Cite
QFAB Bioinformatics (2014). Australian Nucleotide (DNA/RNA) and Protein sequences from Australian organisms in the species Oedura gracilis () [Dataset]. https://researchdata.edu.au/australian-nucleotide-dnarna-oedura-gracilis/442223
Explore at:
Dataset updated
Jun 4, 2014
Dataset provided by
QFAB
Authors
QFAB Bioinformatics
Area covered
Australia
Description
This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from Australian Oedura gracilis. Other information about this group:

The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the International Research Community.

The identification of species in Oedura gracilis as Australian dwelling organisms has been achieved by accessing the Australian Plant Census (APC) or Australian Faunal Directory (AFD) through the Atlas of Living Australia.
of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...
springernature.figshare.com
datasetcatalog.nlm.nih.gov
zip
Updated May 31, 2023
Cite
Brad Friedman; Tom Maniatis (2023). of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035245.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.10035245.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Brad Friedman; Tom Maniatis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data
The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset
zenodo.org
openagrar.de
bin, zip
Updated Jul 24, 2023
Cite
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7267487
Explore at:
Available download formats: bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.7267487
Dataset updated
Jul 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 1,426 Listeria monocytogenes samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7116878), comprising genome assemblies of 1,874 L. monocytogenes samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Lm_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type.

The archive “BeONE_Lm_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of L. monocytogenes genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57166. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,426 isolates passed the dataset curation step and were included in the final dataset.

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.
EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of...
figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated May 31, 2023
Cite
Javier del Campo; Martin Kolisko; Vittorio Boscaro; Luciana F. Santoferrara; Serafim Nenarokov; Ramon Massana; Laure Guillou; Alastair Simpson; Cedric Berney; Colomban de Vargas; Matthew W. Brown; Patrick J. Keeling; Laura Wegener Parfrey (2023). EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution [Dataset]. http://doi.org/10.1371/journal.pbio.2005849
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pbio.2005849
Dataset updated
May 31, 2023
Dataset provided by
PLOS Biology
Authors
Javier del Campo; Martin Kolisko; Vittorio Boscaro; Luciana F. Santoferrara; Serafim Nenarokov; Ramon Massana; Laure Guillou; Alastair Simpson; Cedric Berney; Colomban de Vargas; Matthew W. Brown; Patrick J. Keeling; Laura Wegener Parfrey
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Environmental sequencing has greatly expanded our knowledge of micro-eukaryotic diversity and ecology by revealing previously unknown lineages and their distribution. However, the value of these data is critically dependent on the quality of the reference databases used to assign an identity to environmental sequences. Existing databases contain errors and struggle to keep pace with rapidly changing eukaryotic taxonomy, the influx of novel diversity, and computational challenges related to assembling the high-quality alignments and trees needed for accurate characterization of lineage diversity. EukRef (eukref.org) is an ongoing community-driven initiative that addresses these challenges by bringing together taxonomists with expertise spanning the eukaryotic tree of life and microbial ecologists, who use environmental sequence data to develop reliable reference databases across the diversity of microbial eukaryotes. EukRef organizes and facilitates rigorous mining and annotation of sequence data by providing protocols, guidelines, and tools. The EukRef pipeline and tools allow users interested in a particular group of microbial eukaryotes to retrieve all sequences belonging to that group from International Nucleotide Sequence Database Collaboration (INSDC) (GenBank, the European Nucleotide Archive [ENA], or the DNA DataBank of Japan [DDBJ]), to place those sequences in a phylogenetic tree, and to curate taxonomic and environmental information for the group. We provide guidelines to facilitate the process and to standardize taxonomic annotations. The final outputs of this process are (1) a reference tree and alignment, (2) a reference sequence database, including taxonomic and environmental information, and (3) a list of putative chimeras and other artifactual sequences. These products will be useful for the broad community as they become publicly available (at eukref.org) and are shared with existing reference databases.
SUPERSEDED - Human, yeast and pig genomics: sequence submissions and first...
find.data.gov.scot
dtechtive.com
csv, txt
Updated Jun 6, 2018
Cite
Science, Technology and Innovation Studies. University of Edinburgh (2018). SUPERSEDED - Human, yeast and pig genomics: sequence submissions and first sequence descriptions in the literature (1980-2015) [Dataset]. http://doi.org/10.7488/ds/2358
Explore at:
Available download formats: csv (0.1938 MB), csv (11.83 MB), csv (0.8656 MB), csv (0.056 MB), csv (0.5938 MB), csv (1.918 MB), txt (0.0166 MB)
Unique identifier
https://doi.org/10.7488/ds/2358
Dataset updated
Jun 6, 2018
Dataset provided by
Science, Technology and Innovation Studies. University of Edinburgh
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2589

This data collection is derived from two sources: 1) Submissions of DNA sequences of S. cerevisiae (yeast), Sus scrofa (pig) and Homo sapiens (human) to the European Nucleotide Archive, and 2) First description of these sequences in the scientific literature. The time range of the records is 1980-2000 (yeast), 1985-2005 (human) and 1990-2015 (pig). In total, each species has two associated datasets: 1) A .csv file documenting the PubMed ID of each article describing new sequences, all paper authors, all institutional affiliations of each author, country of institution, year of first submission to the European Nucleotide Archive, and the year of article publication, and 2) A .csv file documenting all institutions submitting to the European Nucleotide Archive, number of nucleotides sequenced, number of submissions per institution, and year of submission to the database. The approximate number of records is 28,000 publications and over 2 million sequence submissions. Some data about submitting institutions is not fully cleaned.
The OHEJP BeONE Project – Salmonella enterica genome assembly dataset
data.europa.eu
data.niaid.nih.gov
unknown
Updated Jul 3, 2025
Cite
Zenodo (2025). The OHEJP BeONE Project – Salmonella enterica genome assembly dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7802723?locale=pl
Explore at:
Available download formats: unknown (189629)
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 1,540 Salmonella enterica samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7119735), comprising genome assemblies of 1,434 S. enterica samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Se_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype, and information regarding year of sampling, country and source.

The archive “BeONE_Se_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of S. enterica genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57179. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,540 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with SeqSero2 v1.2.1 (Zhang et al. 2019).

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

Acknowledgements

We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.
Nucleotide (DNA / RNA) and Protein sequences from the Australian research...
researchdata.edu.au
Updated Jul 20, 2012
Cite
QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian research institution University of Notre Dame [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-notre-dame/56616
Explore at:
Dataset updated
Jul 20, 2012
Dataset provided by
QFAB
Authors
QFAB Bioinformatics
Description
This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian research institution University of Notre Dame. The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the International Research Community.
MOESM4 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...
springernature.figshare.com
datasetcatalog.nlm.nih.gov
zip
Updated Jun 1, 2023
Cite
Brad Friedman; Tom Maniatis (2023). MOESM4 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035281.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.10035281.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Brad Friedman; Tom Maniatis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 4: Archival copy of software. (ZIP 3 MB)
Overview of the bioinformatic steps involved in submitting novel DNA...
plos.figshare.com
xls
Updated May 30, 2023
Cite
Michael Gruenstaeudl; Yannick Hartmaring (2023). Overview of the bioinformatic steps involved in submitting novel DNA sequence data to ENA when using EMBL2checklists, starting from assembled DNA sequences. [Dataset]. http://doi.org/10.1371/journal.pone.0210347.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0210347.t001
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Michael Gruenstaeudl; Yannick Hartmaring
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview of the bioinformatic steps involved in submitting novel DNA sequence data to ENA when using EMBL2checklists, starting from assembled DNA sequences.
BioSample Database at EBI
neuinfo.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). BioSample Database at EBI [Dataset]. http://identifiers.org/RRID:SCR_004856
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_004856
Dataset updated
Jan 29, 2022
Description
Database that aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive or the PRoteomics IDEntifications database (PRIDE). It provides links to assays for specific samples, and accepts direct submissions of sample information. The goals of the BioSample Database include:
# recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE;
# minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases; and
# supporting cross database queries by sample characteristics.
The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples (at) ebi.ac.uk.
Current data sources:
* European Nucleotide Archive (424,811 samples)
* PRIDE (17,001 samples)
* ArrayExpress (1,187,884 samples)
* ENCODE cell lines (119 samples)
* CORIELL cell lines (27,002 samples)
* Thousand Genome (2,628 samples)
* HapMap (1,417 samples)
* IMSR (248,660 samples)
The OHEJP BeONE Project – Campylobacter jejuni genome assembly dataset
data.europa.eu
data.niaid.nih.gov
unknown
Updated May 16, 2024
Cite
Zenodo (2024). The OHEJP BeONE Project – Campylobacter jejuni genome assembly dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7802717?locale=fr
Explore at:
Available download formats: unknown (74818)
Dataset updated
May 16, 2024
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset
This dataset comprises the genome assemblies of 610 Campylobacter jejuni samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). A complementary dataset is also available (https://zenodo.org/record/7120166), comprising genome assemblies of 3,076 C. jejuni samples selected from the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). The file “BeONE_Cj_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type, as well as the year of sampling, country and source. The archive “BeONE_Cj_assemblies.zip” contains the genome assemblies (.fasta format) of all isolates listed in the metadata file.

Dataset selection and curation
This anonymized dataset of C. jejuni genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium and available at the European Nucleotide Archive under BioProject Accession Number PRJEB57119. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, and MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the remaining assemblies, a considerable proportion was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated into the final dataset (those samples are labeled in the metadata file). In total, 610 isolates passed the dataset curation step and were included in the final dataset.

Funding
This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

Acknowledgements
We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.
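The recovery rule described under dataset selection and curation can be illustrated with a minimal sketch. It assumes each assembly is summarised by the set of QC parameters it failed and by the percentage of reads assigned to the correct species. The class, field and function names are hypothetical and do not reflect the Aquamis output format or the consortium's actual scripts; only the parameter name “NumContamSNVs” and the >98% threshold come from the description.

```python
from dataclasses import dataclass

# Values taken from the dataset description.
CONTAM_CHECK = "NumContamSNVs"
RECOVERY_THRESHOLD_PCT = 98.0


@dataclass(frozen=True)
class Assembly:
    """Hypothetical per-isolate QC summary (not the Aquamis report format)."""
    sample_id: str
    failed_checks: frozenset  # names of failed QC parameters; empty if QC passed
    correct_species_read_pct: float


def curation_status(assembly: Assembly) -> str:
    """Classify an assembly as 'pass', 'recovered' or 'excluded'."""
    if not assembly.failed_checks:
        return "pass"
    # Recover only assemblies that failed exclusively on the contamination
    # parameter and whose correct-species read percentage exceeds 98%.
    only_contam = assembly.failed_checks == {CONTAM_CHECK}
    if only_contam and assembly.correct_species_read_pct > RECOVERY_THRESHOLD_PCT:
        return "recovered"
    return "excluded"


def final_dataset(assemblies):
    """Keep assemblies that passed QC or were recovered."""
    return [a for a in assemblies if curation_status(a) != "excluded"]
```

In this sketch, an assembly that fails any QC parameter other than “NumContamSNVs” is always excluded, regardless of its read percentage.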
Nucleotide (DNA / RNA) and Protein sequences from the Australian research...
researchdata.edu.au
Updated Jul 20, 2012
QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian research institution Western Health [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-western-health/53125
Dataset updated
Jul 20, 2012
Dataset provided by
QFAB
Authors
QFAB Bioinformatics
Area covered
Australia
Description
This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian research institution Western Health. The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and the Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the international research community.
MOESM3 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and...
springernature.figshare.com
Updated May 31, 2023
Brad Friedman; Tom Maniatis (2023). MOESM3 of ExpressionPlot: a web-based framework for analysis of RNA-Seq and microarray gene expression data [Dataset]. http://doi.org/10.6084/m9.figshare.10035278.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.10035278.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Brad Friedman; Tom Maniatis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 3: Data for Figure 2d. (ZIP 6 MB)
Nucleotide (DNA / RNA) and Protein sequences from the Australian dwelling...
researchdata.edu.au
Updated Jul 20, 2012
QFAB Bioinformatics (2012). Nucleotide (DNA / RNA) and Protein sequences from the Australian dwelling species Pityrodia uncinata [Dataset]. https://researchdata.edu.au/nucleotide-dna-rna-pityrodia-uncinata/48964
Dataset updated
Jul 20, 2012
Dataset provided by
QFAB
Authors
QFAB Bioinformatics
Description
This data collection contains all currently published nucleotide (DNA/RNA) and protein sequences from the Australian dwelling organism Pityrodia uncinata.

The nucleotide (DNA/RNA) and protein sequences have been sourced through the European Nucleotide Archive (ENA) and the Universal Protein Resource (UniProt), databases that contain comprehensive sets of nucleotide (DNA/RNA) and protein sequences from all organisms that have been published by the international research community.

The identification of the species Pityrodia uncinata as an Australian dwelling organism has been achieved by accessing the Australian Plant Census (APC) or Australian Faunal Directory (AFD) through the Atlas of Living Australia.


