https://www.neonscience.org/data-samples/data-policies-citation
COI DNA sequences from select fish in lakes and wadeable streams
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Barcode of Life Data System (BOLD, http://www.boldsystems.org/) is a web platform that provides an integrated environment for the assembly and use of DNA barcode data. It delivers an online database for the collection and management of specimen, distributional, and molecular data, as well as analytical tools to support their validation. Over the past few years, BOLD has grown to become a powerful online workbench and the central informatics hub of the DNA barcoding community. BOLD is freely available to any researcher with interests in DNA barcoding. By providing specialized services, it aids in the publication of records that meet the standards needed to gain BARCODE designation in the global sequence databases. Because of its web-based delivery and flexible data security model, it is also well positioned to support projects that involve broad research alliances.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Museomics is an approach to the DNA sequencing of museum specimens that can generate both biodiversity and sequence information. In this study, we surveyed both the biodiversity information database BOLD (Barcode of Life Data System) and the sequence information database GenBank, using DNA barcoding data as an example, with the aim of integrating the data from these two databases. DNA barcoding is a method of identifying species from DNA sequences by using short genetic markers. We surveyed how many entries had biodiversity information (such as links to BOLD and specimen IDs) by downloading all fish, insect, and flowering plant data available from GenBank Nucleotide; a BOLD ID was assigned to 26.2% of entries for insects. In the same way, we downloaded the respective BOLD data and checked the status of links to sequence information. We also investigated how many species these databases cover, and 7,693 species were found to exist only in BOLD. In the future, as museomics develops as a field, the targeted sequences will be extended beyond DNA barcodes to mitochondrial genomes, other genes, and genome sequences. Consequently, the value of the sequence data will increase. In addition, a wider range of species will be sequenced and, thus, biodiversity information, such as the photographs of evidence specimens used as a basis for species identification, will become even more indispensable. This study contributes to the acceleration of museomics-associated research by using databases in a cross-sectional manner.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Building DNA barcode databases for plants has historically been ad hoc, and often with a relatively narrow taxonomic focus. To realize the full potential of DNA barcoding for plants, and particularly its application to metabarcoding for mixed-species environmental samples, systematic sequencing of reference collections is required using an augmented set of DNA barcode loci, applied according to agreed data generation and analysis standards. The largest and most complete reference collections of plants are held in herbaria. Australia has a globally significant flora that is well sampled and expertly curated by its herbaria, coordinated through the Council of Heads of Australasian Herbaria. There exists a tremendous opportunity to provide a comprehensive and taxonomically robust reference database for plant DNA barcoding applications by undertaking coordinated and systematic sequencing of the entire flora of Australia utilizing existing herbarium material. In this paper, we review the development of DNA barcoding and metabarcoding and consider the requirements for a robust and comprehensive system. We analyzed the current availability of DNA barcode reference data for Australian plants, recommend priority taxa for database inclusion, and highlight future applications of a comprehensive metabarcoding system. We urge that large-scale and coordinated analysis of herbarium collections be undertaken to realize the promise of DNA barcoding and metabarcoding, and propose that the generation and curation of reference data should become a national investment priority.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains 438 records of Trichoptera species from 22 of the 23 families known from the Iberian Peninsula. Specimens were collected between 1975 and 2018 in Portugal, Spain and France (Paupério et al., 2023). Specimens have been identified to species or subspecies level, for a total of 141 species, representing 37% of the caddisflies known from the Iberian Peninsula. Specimens were captured during fieldwork directed specifically at the sampling of Trichoptera using different methodologies and stored in 96% ethanol. A tissue sample, usually a leg, was collected from each individual, from which DNA was extracted. Sequencing of the 658 bp COI DNA barcode was conducted within the InBIO Barcoding Initiative (IBI) and all DNA sequences were submitted to the BOLD (Barcode of Life Data System) and GenBank databases. Specimens are deposited in the IBI collection at the CIBIO (Research Center in Biodiversity and Genetic Resources, Portugal) or in the Marcos A. González collection at the University of Santiago de Compostela (Spain). All DNA extracts are deposited in the IBI collection.
DNA barcoding is a method of identifying individual organisms using short DNA fragments matched to a database of reference sequences. For metazoan plankton, a high proportion of species that reside in the deep ocean still lack reliable reference sequences for genetic markers for barcoding and systematics. We report on substantial taxonomic and barcoding efforts across major zooplankton taxonomic groups collected from surface waters to the rarely sampled abyssopelagic zone (0 – 4300 m) in the Gulf of Alaska, North Pacific Ocean. Over 1000 specimens were identified, from which the mitochondrial 16S and COI and nuclear 18S rRNA genes were sequenced. In total, 1462 sequences for 254 unique taxa were generated, adding new barcodes for 107 species, including 12 undescribed species of cnidarians, that previously lacked DNA sequences for at least one of the three genes. Additionally, we introduce a new Open Nomenclature qualifier, the deoxyribonucleic acid abbreviation DNA (e.g., Genus DNA species, DNA Genus). This qualifier was used for specimens that could not be morphologically identified but could be assigned a low-level taxonomic identification based on the clustering of DNA barcode genes in phylogenetic trees (100% bootstrap support), where at least one of the sequences in that clade could be referred to a physical specimen (or photographs) whose identification could be corroborated through morphological analyses. DNA barcodes from this work are incorporated into the MetaZooGene Atlas and Database, an open-access data and metadata portal for barcoding genes used for classifying and identifying marine organisms. As environmental sequencing (i.e., metabarcoding, metagenetics, and eDNA) becomes an increasingly common approach in marine ecosystem studies, continued population of such reference DNA sequence databases must remain a high priority.
This data product contains the quality-controlled laboratory metadata and QA results for NEON's cytochrome oxidase I (COI) barcoding of fish sequences. Fin clips are taken from a subset of collected fish for DNA analysis. The DNA barcoding procedure involves removing tissue, extracting and sequencing DNA from the tissue, and matching that sequence data to sequences from previously identified voucher specimens. DNA analysis serves a number of purposes, including verification of the taxonomy of specimens that do not receive expert identification, clarification of the taxonomy of rare or cryptic species, and characterization of diversity using molecular markers. For additional details, see the user guide, protocols, and science design listed in the Documentation section of this data product's details webpage.
Queries for this data product will return metadata tables formatted for submission to the Barcode of Life Data System (BOLD). These queries will also provide links to the actual sequence data, which are publicly available on BOLD (http://www.barcodinglife.com/). The sequence data can be obtained by following the links from the NEON data portal, or by directly querying NEON data sets on the BOLD server. From the NEON portal, the link "BOLD Project: Fish sequences DNA barcode" redirects to a page on the BOLD public data portal for the queried data. This is a dynamic link and will automatically update based on the user query.
Latency: The expected time from data and/or sample collection in the field to data publication is as follows, for each of the data tables (in days) in the downloaded data package. See the Data Product User Guide for more information.
fsh_BOLDcollectionData: 390
fsh_BOLDspecimenDetails: 390
fsh_BOLDtaxonomy: 390
fsh_BOLDvoucherInfo: 390
Fin clips will be collected from 5-10 individuals of a target species. These tissues will be preserved in an appropriate tissue vial and shipped to an external lab. DNA will be extracted and target sequences amplified via PCR. Barcodes of cytochrome oxidase I will be generated per specimen.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains 71 records of Plecoptera specimens collected from 2004 to 2018 in the Iberian Peninsula (Ferreira et al., 2020). Twenty-nine stonefly species are represented in the dataset, contributing to the knowledge of the DNA barcodes and distribution of the Plecoptera in Iberia. Specimens were captured during fieldwork directed specifically at the sampling of Plecoptera using different methodologies and stored in 96% ethanol. All specimens were morphologically identified to species level. A tissue sample, usually a leg, was collected from each individual, from which DNA was extracted. The DNA barcoding of these specimens was conducted within the InBIO Barcoding Initiative (IBI), funded by the EnvMetaGen and PORBIOTA projects. DNA barcode sequences were deposited in the BOLD (Barcode of Life Data System) online database. Preserved specimens and DNA extracts are deposited in the IBI collection at the CIBIO (Research Center in Biodiversity and Genetic Resources).
https://www.neonscience.org/data-samples/data-policies-citation
COI DNA sequences from select small mammals
Demosponge classification is notoriously challenged by the paucity of informative morphological characters with sufficient complexity to discriminate between apomorphies and convergences. Molecular data, preferably from type material, help shed light on phylogenetic relationships. Here, based on the results of DNA barcoding of type (and other) material, we review the classification of several demosponge species and genera with either eminent or previously uncertain classification. We report that the aster-bearing genus Leptosastra Topsent, 1904, is a poecilosclerid, unlike Clathria faviformis Lehnert & van Soest, 1996, which should be classified as Raspailiidae. The genus transfers of Eurypon laughlini Díaz, Alvarez & van Soest, 1987 to Prosuberites Topsent, and Leucophloeus lewisi Van Soest & Stentoft, 1988, to Axinyssa Lendenfeld are supported, unlike the transfer of Halichondria almae (Carballo, Uriz & García-Gómez, 1996) from Ciocalapata de Laubenfels. The new sequences are the first to be published in the new version of the Sponge Barcoding Database (SBDv2) of the Sponge Barcoding Project (www.spongebarcoding.org). Our findings underline the benefits of sequencing historic reference material, even if it is centuries old, and emphasise that type material should always be considered in answering systematic questions, particularly with challenging taxa such as sponges. Zootaxa, in revision
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 917 specimen records of 179 chironomid species from Japan, spanning 1989 to 2015. It is based on the Chironomid DNA Barcode Database published by the National Institute for Environmental Studies, Japan (NIES). The Chironomid DNA Barcode Database can be found at https://www.nies.go.jp/yusurika/en/index.html.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Molecular tools are increasingly being used to survey biodiversity and species interactions within ecosystems. Indirect methods, like environmental DNA (eDNA) and invertebrate-derived DNA (iDNA), depend on sequence databases with accurate and sufficient taxonomic representation. These methods are increasingly being used in regions and habitats where direct detection or observation can be difficult for a variety of reasons. Madagascar is a biodiversity hotspot with a high proportion of endemic species, many of which are threatened or endangered. Here we describe a new resource, VoronaGasyCodes, a curated database of newly published genetic sequences from Malagasy birds. Our database is currently populated with six mitochondrial genes or DNA barcodes for 142 species, including 70% of the birds endemic to the island, and will be periodically updated as new data become available. We demonstrate the utility of our database with an iDNA study of leech blood meals in which we successfully identified 77% of the hosts to species. These types of resources for characterizing biodiversity are critical for insights into species distribution, discovery of new taxa, novel ecological connections, and advancing conservation and restoration measures.
https://www.neonscience.org/data-samples/data-policies-citation
COI DNA sequences from select mosquitoes
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The capacity to identify an unknown organism using the DNA sequence from a single gene has many applications. These include the development of biodiversity inventories (Janzen et al. 2005), forensics (Meiklejohn et al. 2011), biosecurity (Armstrong and Ball 2005), and the identification of cryptic species (Smith et al. 2006). The popularity and widespread use (Teletchea 2010) of the DNA barcoding approach (Hebert et al. 2003), despite broad misgivings (e.g., Smith 2005; Will et al. 2005; Rubinoff et al. 2006), attest to this.
However, one major shortcoming of the standard barcoding approach is that it assumes that gene trees and species trees are synonymous, an assumption that is known not to hold in many cases (Pamilo and Nei 1988; Funk and Omland 2003). Biological processes that violate this assumption include incomplete lineage sorting and interspecific hybridization (Funk and Omland 2003). Indeed, simulation studies indicate that the concatenation approach (in which these two processes are ignored) can lead to statistically inconsistent estimation of the species tree (Kubatko and Degnan 2007). Recent developments now make a barcoding approach that utilizes a single locus outdated. The cost of sequencing multiple gene fragments is no longer prohibitive, but more importantly, a range of analytical approaches have been developed that account for incomplete lineage sorting (Degnan and Salter 2005; Edwards et al. 2007; Liu et al. 2008; Kubatko et al. 2009; Heled and Drummond 2010; Yang and Rannala 2010). These approaches incorporate coalescent theory into the analysis of species trees and species delimitation (Fujita et al. 2012) and are conveniently accessible as software programs (e.g., BEST, BPP, *BEAST, MrBayes v. 3.2, STEM, and COAL).
Although the general mixed Yule coalescent (GMYC) approach has also been developed for species delimitation (Pons et al. 2006), we do not consider it further here. It operates quite differently from the approaches outlined above (i.e., BEST, BPP, *BEAST, MrBayes v. 3.2, STEM, and COAL). The GMYC approach seeks to identify the shift in the rate of lineage branching that should be evident when interspecific evolutionary processes switch to population-level processes (Pons et al. 2006). Both empirical (Esselstyn et al. 2012) and simulation studies (Esselstyn et al. 2012; Fujisawa and Barraclough 2013) report that it performs poorly when effective population sizes and speciation rates are high but still within biologically relevant ranges.
Ideally, a "next-generation" barcoding approach would (1) identify a minimal set of barcoding genes (perhaps specific to certain lineages), (2) generate a large and cladistically divergent database for comparisons, and (3) identify species using species delimitation approaches that incorporate the multispecies coalescent. The first two of these conditions are straightforward and require only discussion (requirement 1) and resources (requirement 2). However, the third requirement is much more problematic. Some of the recently developed approaches for species delimitation cannot be used alone; for example, BPP requires a user-specified guide tree (Yang and Rannala 2010). All of the recently developed approaches are computationally intensive (Degnan and Rosenberg 2009), with many having practical limitations on the number of individuals that can be compared. By contrast, the current barcoding approach is able to compare enormous numbers of sequences in a very short time, primarily because the approach is analytically simple: a single sequence is compared with all sequences in the database by calculating all possible pairwise K2P distances. As long as exemplars exist within the database that have K2P distances below some predetermined threshold (usually 4%), the species is considered identified. The speed of analysis is due primarily to the use of distance-based measures.
The purpose of this article is to initiate the development of a framework for "next-gen barcoding": one that incorporates the multispecies coalescent, but does so by comparing multiple gene sequences from an unknown taxon with a database of sequences.
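The distance-based identification step described above (pairwise K2P distances compared against a threshold) can be illustrated with a minimal Python sketch. It is not the authors' or BOLD's implementation. It assumes sequences are already aligned to equal length, uses the standard Kimura 2-parameter formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions, and takes the 4% threshold from the text. The function names and the reference dictionary layout are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # sequences too divergent for the K2P correction
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


def identify(query, reference, threshold=0.04):
    """Return (species, distance) for the closest reference sequence whose
    K2P distance is below the threshold, or None if there is no such match.

    `reference` is a hypothetical layout: species name -> list of aligned
    sequences.
    """
    best = None
    for species, sequences in reference.items():
        for seq in sequences:
            d = k2p_distance(query, seq)
            if d < threshold and (best is None or d < best[1]):
                best = (species, d)
    return best
```

Every query is compared against every reference sequence, so the cost grows linearly with database size per query, which is consistent with the text's point that distance-based comparison stays fast where coalescent methods do not.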
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Established in 2008, the International Barcode of Life Consortium (iBOL, http://www.ibol.org/) is a research alliance of nations with the desire to transform biodiversity science by building the DNA barcode reference libraries, the sequencing facilities, the informatics platforms, the analytical protocols, and the international collaboration required to inventory and assess biodiversity.
iBOL has overseen the completion of one major program, BARCODE 500K, and a second program, BIOSCAN, runs for seven years from June 2019. The first program barcoded 500,000 species, reflecting the investment of $150 million by research organizations in 25 nations. Building on this success, BIOSCAN will extend barcode coverage to 2.5 million species by 2025. This program will stimulate activation of the Planetary Biodiversity Mission (PBM), iBOL's final project. PBM is a research initiative that will deliver a comprehensive understanding of the composition and distribution of multi-cellular life by 2045.
iBOL maintains the Barcode of Life Data System (BOLD, http://www.boldsystems.org/). BOLD is a cloud-based data storage and analysis platform developed at the Centre for Biodiversity Genomics in Canada. It consists of four main modules: a data portal, an educational portal, a registry of BINs (putative species), and a data collection and analysis workbench.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains five records of the alderfly genus Sialis Latreille, 1803 (Megaloptera, Sialidae) collected in 2011 and 2015 in northern continental Portugal. The study of these specimens found two species previously unknown in the country, S. lutaria Linnaeus, 1758 and S. nigripes Pictet, 1865, and confirmed the presence of S. fuliginosa Pictet, 1836 in Portugal (Ferreira et al. 2019). In that work, the three species were identified morphologically and confirmed with DNA barcodes. Specimens were detected by direct search on vegetation and rocks around rivers and streams and captured with a hand-net. Captured specimens were identified to species level and preserved in 96% ethanol. A tissue sample, a leg, was collected from each individual, from which DNA was extracted. The DNA barcoding of these specimens was conducted within the InBIO Barcoding Initiative (IBI), funded by the EnvMetaGen and PORBIOTA projects. DNA barcode sequences were deposited in the BOLD (Barcode of Life Data System) and GenBank online databases. DNA extracts are deposited in the IBI collection at the CIBIO (Research Center in Biodiversity and Genetic Resources).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary data for Erin R. Fenneman's MSc Thesis (University of British Columbia), DNA Barcoding the Vascular Plant Flora of Southern BC. Includes sequence data for the rbcL and matK DNA barcode markers for 1511 samples, representing 907 vascular plant species. Samples represent both field-collected tissue and UBC herbarium tissue samples. Sequencing was completed by the Canadian Centre for DNA Barcoding (CCDB). Also includes an additional 1286 rbcL and 906 matK sequences downloaded from the Barcode of Life Database (BOLD) and GenBank. The fileset includes Nexus files for sequence alignments, Neighbour Joining trees, Bootstrap Consensus trees, and output files for BLASTn searches using a local custom database.
The dataset contains 234 records of lacewing (Neuroptera) species collected from 2006 to 2019 in continental Portugal (Oliveira et al., 2021). Specimens were detected and captured by direct search of the environment and by using both UV and mercury-vapor lamps to attract the insects. Captured specimens were preserved in 96% ethanol. All captured specimens were identified to species level. Samples of each species were selected for DNA sequencing based on their geographic provenance. From each specimen, one leg (tissue sample) was removed for DNA extraction. The DNA barcoding of these specimens was conducted within the InBIO Barcoding Initiative (IBI), funded by the EnvMetaGen and PORBIOTA projects. DNA barcode sequences were deposited in the BOLD (Barcode of Life Data System) and GenBank databases. All specimens and DNA extracts are deposited in the IBI collection at the CIBIO (Research Center in Biodiversity and Genetic Resources).
https://www.neonscience.org/data-samples/data-policies-citation
COI DNA sequences from select ground beetles
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A Viridiplantae ITS2 reference database created using BCdatabaser with the following parameters:
This database was curated using a curation pipeline workflow available on GitHub.
If you use this dataset, please cite Quaresma et al. 2024, Scientific Data, DOI: 10.1038/s41597-024-02962-5