100+ datasets found
  1. GenBank

    • neuinfo.org
    • rrid.site
    • +1 more
    Updated Sep 17, 2024
    Cite
    (2024). GenBank [Dataset]. http://identifiers.org/RRID:SCR_002760
    Dataset updated
    Sep 17, 2024
    Description

    NIH genetic sequence database that provides an annotated collection of all publicly available DNA sequences for almost 280 000 formally described species (Jan 2014). These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. GenBank is part of the International Nucleotide Sequence Database Collaboration, and daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
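
    As an illustration of programmatic Entrez access, the sketch below builds a request URL for the NCBI E-utilities `efetch` endpoint without sending it. The helper name `efetch_url` is hypothetical, and the accession used in the usage note is only the one quoted elsewhere in this listing (L49178.1). Check the NCBI usage guidelines before issuing real requests.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build (but do not send) an efetch URL for one nucleotide record.

    `efetch_url` is an illustrative helper, not part of any NCBI library.
    """
    params = {
        "db": "nuccore",      # Entrez nucleotide database
        "id": accession,      # accession, optionally with version suffix
        "rettype": rettype,   # e.g. "fasta" or "gb" (GenBank flat file)
        "retmode": "text",
    }
    return EFETCH_BASE + "?" + urlencode(params)
```

    For example, `efetch_url("L49178.1")` yields a URL that could be passed to `urllib.request.urlopen` to retrieve the record as FASTA text.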

  2. GenBank

    • catalog.data.gov
    • datadiscovery.nlm.nih.gov
    • +3 more
    Updated Jul 17, 2025
    + more versions
    Cite
    National Library of Medicine (2025). GenBank [Dataset]. https://catalog.data.gov/dataset/genbank-14853
    Dataset updated
    Jul 17, 2025
    Dataset provided by
    National Library of Medicine
    Description

    NIH Genetic sequence database; an annotated collection of all publicly available DNA sequences.

  3. Nucleotide Sequence Database

    • bioregistry.io
    • identifiers.org
    Updated Apr 9, 2022
    Cite
    (2022). Nucleotide Sequence Database [Dataset]. https://bioregistry.io/insdc
    Dataset updated
    Apr 9, 2022
    Description

    The International Nucleotide Sequence Database Collaboration (INSDC) is a joint effort to collect and disseminate databases containing DNA and RNA sequences.

  4. NCBI Protein Database

    • neuinfo.org
    • rrid.site
    • +2 more
    Updated Feb 1, 2001
    Cite
    (2001). NCBI Protein Database [Dataset]. http://identifiers.org/RRID:SCR_003257
    Dataset updated
    Feb 1, 2001
    Description

    Databases of protein sequences and 3D structures of proteins. Collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.

  5. NCBI Genome Survey Sequences Database

    • neuinfo.org
    • scicrunch.org
    • +2 more
    Updated Sep 15, 2024
    Cite
    (2024). NCBI Genome Survey Sequences Database [Dataset]. http://identifiers.org/RRID:SCR_002146
    Dataset updated
    Sep 15, 2024
    Description

    Database of unannotated, short, single-read, primarily genomic sequences from GenBank, including random survey sequences, clone-end sequences and exon-trapped sequences. The GSS division of GenBank is similar to the EST division, except that most of the sequences are genomic in origin rather than cDNA (mRNA). Two classes (exon-trapped products and gene-trapped products) may be derived via a cDNA intermediate. Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to genomic sequence. The GSS division contains (but is not limited to) the following types of data: random single-pass-read genome survey sequences; cosmid/BAC/YAC end sequences; exon-trapped genomic sequences; Alu PCR sequences; and transposon-tagged sequences. Although dbGSS sequences are incorporated into the GSS division of GenBank, annotation in dbGSS is more comprehensive and includes detailed information about the contributors, experimental conditions, and genetic map locations.

  6. Genome Sequence Data Set01

    • catalog.data.gov
    Updated Nov 12, 2020
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Genome Sequence Data Set01 [Dataset]. https://catalog.data.gov/dataset/genome-sequence-data-set01-90ee2
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The fasta files (Genome_Set01.zip) contain the reference-assisted de novo assemblies (as contigs) of four Campylobacter spp. isolates. The table contains rows as isolates (yellow) and columns as attributes (green) for each individual genome. This dataset is associated with the following publication: Gomez-Alvarez, V., N. Ashbolt, J. Griffith, J. Santo Domingo, and J. Lu. Whole-Genome Sequencing of Four Campylobacter strains Isolated from Gull Excreta collected from Hobie Beach (Oxnard, CA, USA). Microbiology Resource Announcements. American Society for Microbiology, Washington, DC, USA, 8(32): e00560-19, (2019).

  7. High Throughput Genomic Sequences Division

    • dknet.org
    • scicrunch.org
    • +1 more
    Updated Jan 29, 2022
    Cite
    (2022). High Throughput Genomic Sequences Division [Dataset]. http://identifiers.org/RRID:SCR_002150/resolver
    Dataset updated
    Jan 29, 2022
    Description

    Database of high-throughput genome sequences from large-scale genome sequencing centers, including unfinished and finished sequences. It was created to accommodate a growing need to make unfinished genomic sequence data rapidly available to the scientific community in a coordinated effort among the International Nucleotide Sequence databases, DDBJ, EMBL, and GenBank. Sequences are prepared for submission by using NCBI's software tools Sequin or tbl2asn. Each center has an FTP directory into which new or updated sequence files are placed. Sequence data in this division are available for BLAST homology searches against either the htgs database or the month database, which includes all new submissions for the prior month. Unfinished HTG sequences containing contigs greater than 2 kb are assigned an accession number and deposited in the HTG division. A typical HTG record might consist of all the first-pass sequence data generated from a single cosmid, BAC, YAC, or P1 clone, which together make up more than 2 kb and contain one or more gaps. A single accession number is assigned to this collection of sequences, and each record includes a clear indication of the status (phase 1 or 2) plus a prominent warning that the sequence data are unfinished and may contain errors. The accession number does not change as sequence records are updated; only the most recent version of a HTG record remains in GenBank.

  8. GenBank

    • integbio.jp
    • bioregistry.io
    Cite
    NCBI (National Center for Biotechnology Information), GenBank [Dataset]. https://integbio.jp/dbcatalog/en/record/nbdc00276?jtpl=56
    Dataset provided by
    National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
    License

    http://www.ncbi.nlm.nih.gov/genbank/

    Description

    This is a public database of DNA sequences with annotations. It is a part of the International Nucleotide Sequence Database Collaboration, and cooperates with the DNA DataBank of Japan (DDBJ) as well as the European Molecular Biology Laboratory (EMBL).

  9. The tpm metabarcoding DNA sequence database for taxonomic allocations using...

    • dataon.kisti.re.kr
    • data.niaid.nih.gov
    Updated Jun 23, 2021
    + more versions
    Cite
    POZZI Adrien C.M.;MARJOLET Laurence;COURNOYER Benoît (2021). The tpm metabarcoding DNA sequence database for taxonomic allocations using RDP classifier implemented in DADA2. [Dataset]. https://dataon.kisti.re.kr/search/78bdd4325edd4066e88f23e87f192507
    Dataset updated
    Jun 23, 2021
    Authors
    POZZI Adrien C.M.;MARJOLET Laurence;COURNOYER Benoît
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The tpm metabarcoding DNA sequence database for taxonomic allocations using the Mothur and DADA2 bio-informatic tools

    A.C.M. Pozzi (1), R. Bouchali (1), L. Marjolet (1), B. Cournoyer (1)

    (1) University of Lyon, UMR Ecologie Microbienne Lyon (LEM), CNRS 5557, INRAE 1418, Université Claude Bernard Lyon 1, VetAgro Sup, Research Team “Bacterial Opportunistic Pathogens and Environment” (BPOE), 69280 Marcy L’Etoile, France.

    Corresponding authors: A.C.M. Pozzi, UMR Microbial Ecology, CNRS 5557, CNRS 1418, VetAgro Sup, Main building, aisle 3, 1st floor, 69280 Marcy-L’Etoile, France. Tel. (+33) 478 87 39 47. Fax. (+33) 472 43 12 23. Email: adrien.meynier_pozzi@vetagro-sup.fr; and B. Cournoyer, UMR Microbial Ecology, CNRS 5557, CNRS 1418, VetAgro Sup, Main building, aisle 3, 1st floor, 69280 Marcy-L’Etoile, France. Tel. (+33) 478 87 56 47. Fax. (+33) 472 43 12 23. Email: benoit.cournoyer@vetagro-sup.fr

    Keywords: BACtpm, Bacteria, tpm, thiopurine-S-methyltransferase EC:2.1.1.67, Nucleotide sequences, PCR products, Next-Generation-Sequencing, OTHU

    Description: The tpm gene codes for the thiopurine-S-methyltransferase (TPMT), an enzyme that can detoxify metalloid-containing oxyanions and xenobiotics (Cournoyer et al., 1998). Bacterial TPMTs radiated apart from human and animal TPMTs, and showed a vertical evolution in line with the 16S rRNA gene molecular phylogeny (Favre‐Bonté et al., 2005). The tpm database, named BACtpm, was designed to apply the tpm-metabarcoding analytical scheme published in Aigle et al. (2021). It includes the full tpm identifiers, GenBank accession numbers, and complete taxonomic records (domain down to strain code) of about 215 nucleotide-long tpm sequences of 840 unique taxa belonging to 139 genera. Nucleotide sequences of tpm (range: 190-233 nucleotides) were either retrieved from public repositories (GenBank) or made available by B. Cournoyer’s research group. Colin et al. (2020) described the PCR and high-throughput Illumina MiSeq DNA sequencing procedures used to produce tpm sequences. BACtpm v.2.0.1 (June 2021 release) is made available under the Creative Commons Attribution 4.0 International Licence. It can be used for the taxonomic allocations of tpm sequences down to the species and strain levels. Data are stored in the csv format, enabling future users to reformat them to fit their specific needs.

    Acknowledgments: We thank the worldwide community of microbiologists who made contributions to public databases in the past decades, and made possible the elaboration of the BACtpm database. We also thank the Field Observatory in Urban Hydrology (OTHU, www.graie.org/othu/), Labex IMU (Intelligence des Mondes Urbains), the Greater Lyon Urban Community, the School of Integrated Watershed Sciences H2O'LYON, and the Lyon Urban School for their support in the development of this database. This work was funded by the French national research program for environmental and occupational health of ANSES under the terms of project “Iouqmer” EST 2016/1/120, l'Agence Nationale de la Recherche through ANR-16-CE32-0006, ANR-17-CE04-0010, ANR-17-EURE-0018 and ANR-17-CONV-0004, by the MITI CNRS project named Urbamic, and the French water agency for the Rhône, Mediterranean and Corsica areas through the Desir and DOmic projects. We thank former BPOE lab members who contributed to start and expand the BACtpm database: Céline COLINON, Romain MARTI, Emilie BOURGEOIS, Sébastien RIBUN and Yannick COLIN.

    References:

    Aigle, A., Colin, Y., Bouchali, R., Bourgeois, E., Marti, R., Ribun, S., Marjolet, L., Pozzi, A.C.M., Misery, B., Colinon, C., Bernardin-Souibgui, C., Wiest, L., Blaha, D., Galia, W., Cournoyer, B., 2021. Spatio-temporal variations in chemical pollutants found among urban deposits match changes in thiopurine S-methyltransferase-harboring bacteria tracked by the tpm metabarcoding approach. Sci. Total Environ. 767, 145425. https://doi.org/10.1016/j.scitotenv.2021.145425

    Colin, Y., Bouchali, R., Marjolet, L., Marti, R., Vautrin, F., Voisin, J., Bourgeois, E., Rodriguez-Nava, V., Blaha, D., Winiarski, T., Mermillod-Blondin, F., Cournoyer, B., 2020. Coalescence of bacterial groups originating from urban runoffs and artificial infiltration systems among aquifer microbiomes. Hydrol. Earth Syst. Sci. 24, 4257–4273. https://doi.org/10.5194/hess-24-4257-2020

    Cournoyer, B., Watanabe, S., Vivian, A., 1998. A tellurite-resistance genetic determinant from phytopathogenic pseudomonads encodes a thiopurine methyltransferase: evidence of a widely-conserved family of methyltransferases. [Title footnote: The International Collaboration (IC) accession number of the DNA sequence is L49178.1.] Biochim. Biophys. Acta BBA - Gene Struct. Expr. 1397, 161–168. https://doi.org/10.1016/S0167-4781(98)00020-7

    Favre‐Bonté, S., Ranjard, L., Colinon, C., Prigent‐Combaret, C., Nazaret, S., Cournoyer, B., 2005. Freshwater selenium-methylating bacterial thiopurine methyltransferases: diversity and molecular phylogeny. Environ. Microbiol. 7, 153–164. https://doi.org/10.1111/j.1462-2920.2004.00670.x

    Change Log:

    [2.0.1] - 2021-06-23: tpm nucleotide sequences now provided in two separated columns, either aligned with gaps for repeatable use in Mothur or not aligned and without gaps for use with DADA2.

    [2.1.1] - 2023-10-10: tpm nucleotide sequences added for 20 taxa (Actinoplanes sp. N902-109, Ancylobacter polymorphus DSM2457, Aromatoleum toluclasticum ATCC700605, Aromatoleum bremense PbN1, Aromatoleum diolicum, Candidatus_Macondimonas diazotrophica, Collimonas sp. PAH2, Collimonas humicolas, Emcibacter nanhaiensis CGMCC112471, Leptospira yasudae, Lysobacter sp. TY298, Lysobacter spongiae KACC19276, Lysobacter sp. CF310, Nitrospira sp. ND1, Pseudanabaena biceps PCC7429, Pseudomonas eucalypticola NP1, Pseudomonas alcaligenes MB-090714, Pseudomonas peli DSM17833, Pseudomonas sp. 9AZ, and Pseudomonas sp. NFACC02), Proteobacteria updated to Pseudomonadota, database formatted uniquely for use with RDP/dada2.
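
    The change log distinguishes gapped (aligned) sequences from ungapped ones. A minimal sketch of deriving the ungapped form from a gapped column is given below. It assumes gaps are written as `-` or `.`, and the column names `tpm_id` and `aligned_sequence` are hypothetical placeholders, not the actual BACtpm headers.

```python
import csv
import io


def degap(aligned):
    """Remove alignment gap characters ('-' and '.') from one sequence."""
    return aligned.replace("-", "").replace(".", "")


def ungapped_records(csv_text, id_col="tpm_id", seq_col="aligned_sequence"):
    """Return (identifier, ungapped sequence) pairs from CSV text.

    The default column names are assumptions made for illustration only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[id_col], degap(row[seq_col])) for row in reader]
```

    With the real file, the text would be read from disk and the two column names replaced by the headers actually used in the release.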

  10. Accepted species list of Eurotiales, including a DNA sequence reference...

    • zenodo.org
    Updated Jul 31, 2025
    Cite
    Cobus M Visagie; David Overy; Jos Houbraken; František Sklenář; Bensch Konstanze; Jens Frisvad; Jonathan Mack; Giancarlo Perrone; Robert A. Samson; Nicole Van Vuuren; Neriman Yilmaz; Vit Hubka (2025). Accepted species list of Eurotiales, including a DNA sequence reference database, as curated by the International Commission of Penicillium and Aspergillus (ICPA) [Dataset]. http://doi.org/10.5281/zenodo.16607355
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cobus M Visagie; David Overy; Jos Houbraken; František Sklenář; Bensch Konstanze; Jens Frisvad; Jonathan Mack; Giancarlo Perrone; Robert A. Samson; Nicole Van Vuuren; Neriman Yilmaz; Vit Hubka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Eurotiales is a diverse and speciose order and includes economically important genera like Aspergillus, Penicillium, Paecilomyces and Talaromyces. Historically, species identifications based on morphology are challenging. The publication of accepted species lists and the availability of representative DNA sequences for type strains have contributed greatly towards accurate species identification and facilitated the description of many new species. However, despite current advancements, a proportion of newly described species within these taxonomically challenging genera represent, in fact, existing species, which raises obvious concerns.

    This study thus aimed to further modernise the taxonomy of Eurotiales by addressing key challenges in species identification and classification. Our study objectives were threefold: to review species described after 2023, update the accepted species list, and release a curated DNA sequence dataset to facilitate future species identifications. We conclude that a move to a phylogenetic species concept is necessary but continue to support the inclusion of morphological descriptions and, where possible, associated secondary metabolite, exoenzyme, physiology and ecological data when introducing new species.

    Our list now contains 1393 species classified into four families and 26 genera, with Aspergillus (n=465), Penicillium (n=598) and Talaromyces (n=236) containing the most species. To aid sequence-based identifications and species descriptions under a phylogenetic species concept, we release a curated DNA reference sequence database containing 18837 DNA sequences (3867 ITS, 5277 BenA, 5110 CaM and 4583 RPB2) generated from 5325 strains. Sequences were selected to best cover the infraspecies variation under our current understanding of each species. The sequence database will be kept up to date as new information becomes available. This manuscript presents a major leap towards our goal to facilitate work with Eurotiales, while providing the taxonomic framework to support research excellence related to this important fungal group.

    This dataset is curated and kept up to date by the International Commission of Penicillium and Aspergillus (ICPA). If you have questions or suggestions, please get in contact with ICPA members.

  11. ZooGene - A DNA Sequence Database for Calanoid Copepods and Euphausiids

    • catalog.data.gov
    • cmr.earthdata.nasa.gov
    Updated Jan 1, 2002
    + more versions
    Cite
    University of New Hampshire (Point of Contact) (2002). ZooGene - A DNA Sequence Database for Calanoid Copepods and Euphausiids [Dataset]. https://catalog.data.gov/km/dataset/zoogene-a-dna-sequence-database-for-calanoid-copepods-and-euphausiids
    Dataset updated
    Jan 1, 2002
    Dataset provided by
    University of New Hampshire (Point of Contact)
    Description

    An international partnership created a zooplankton genomic (ZooGene) database of DNA type sequences for calanoid copepods and euphausiids. The ZooGene database was designed to include all species of these groups and to allow expansion to additional zooplankton groups. The ZooGene partnership includes four P.I.s and thirteen expert taxonomic consultants from seven countries. Zooplankton samples are sorted from existing archival collections, obtained in coordination with planned oceanographic research efforts, and collected during National Marine Fisheries Service field surveys. The taxonomic experts confirm species' identifications; DNA sequencing is done at the University of New Hampshire and, in some cases, in other partners' laboratories. For each species, a DNA type sequence is determined for a portion of the mitochondrial cytochrome oxidase I (mtCOI) gene; multiple mtCOI sequences are included as necessary to reflect intraspecific variation. The ZooGene database is designed, created, managed, maintained, and distributed as part of the proposed work; the data is integrated into the Ocean Biogeographical Information System (OBIS).

  12. Chloroplast Genome Database

    • neuinfo.org
    • rrid.site
    • +2 more
    Updated Jan 29, 2022
    Cite
    (2022). Chloroplast Genome Database [Dataset]. http://identifiers.org/RRID:SCR_013421
    Dataset updated
    Jan 29, 2022
    Description

    The Chloroplast Genome Database contains annotated chloroplast/plastid genomes from the Organelle Genomes section at NCBI. Users can search for genes by their annotated names, conduct flexible BLAST searches, download protein and nucleotide sequences extracted from a selected chloroplast genome, and browse the putative protein families (tribes) created using TribeMCL.

  13. Genome Sequence Data Set01

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Nov 12, 2020
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Genome Sequence Data Set01 [Dataset]. https://catalog.data.gov/dataset/genome-sequence-data-set01-d2862
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The fasta files (Genome_Set01.zip) contain the reference-assisted de novo assemblies (as contigs) of three Escherichia coli isolates. The table contains rows as isolates (yellow) and columns as attributes (green) for each individual genome. This dataset is associated with the following publication: Gomez-Alvarez, V., and J. Hoelle-Schwalbach. Draft Genome Sequences of Antibiotic-Resistant Escherichia coli Isolates from U.S. Wastewater Treatment Plants. Microbiology Resource Announcements. American Society for Microbiology, Washington, DC, USA, 8(23): e00351-19, (2019).

  14. DNA Sequence Prediction

    • kaggle.com
    zip
    Updated Jul 8, 2025
    Cite
    Harshvardhan21 (2025). DNA Sequence Prediction [Dataset]. https://www.kaggle.com/datasets/harshvardhan21/dna-sequence-prediction/data
    Available download formats: zip (5010968 bytes)
    Dataset updated
    Jul 8, 2025
    Authors
    Harshvardhan21
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    DNA sequence prediction is a crucial task in bioinformatics, enabling researchers to analyze genetic patterns, predict mutations, and model gene structures. This dataset can be used to implement three machine learning approaches to predict nucleotide sequences: N-Gram, LSTM, and Transformer models.

    We use nucleotide sequences of human genes from the NCBI Gene Database. The dataset consists of: 1. Gene symbols, descriptions, and types. 2. Nucleotide sequences represented as A, T, C, G. 3. Train-validation split: 80% training, 20% testing.
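
    The N-gram approach named in the description can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dataset author's implementation: the function names are hypothetical, the model simply predicts the most frequent nucleotide seen after each context, and the 80/20 split is a plain shuffled split.

```python
import random
from collections import Counter, defaultdict


def split_80_20(items, seed=0):
    """Shuffle a copy of the items and split it 80% / 20%."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def train_ngram(sequences, n=3):
    """Count which nucleotide follows each (n-1)-nucleotide context."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            context = seq[i:i + n - 1]
            next_base = seq[i + n - 1]
            counts[context][next_base] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent next nucleotide, or None for an unseen context."""
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]
```

    Training on the first part of the split and scoring `predict_next` on the held-out part gives a simple baseline against which the LSTM and Transformer models can be compared.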

  15. Genome Reviews

    • neuinfo.org
    • dknet.org
    • +2 more
    Updated Oct 31, 2005
    Cite
    (2005). Genome Reviews [Dataset]. http://identifiers.org/RRID:SCR_007685
    Dataset updated
    Oct 31, 2005
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented April 24, 2017. The Genome Reviews database provides an up-to-date, standardized and comprehensively annotated view of the genomic sequence of organisms with completely deciphered genomes. Currently, Genome Reviews contains the genomes of archaea, bacteria, bacteriophages and selected eukaryota. Genome Reviews is available as a MySQL relational database, or a flat file format derived from that in the EMBL Nucleotide Sequence Database. An Ensembl-style browser is now available for Genome Reviews, providing a zoomable graphical view of all chromosomes and plasmids represented in the database. The location and structure of all genes is shown and the distribution of features throughout the sequence is displayed.

  16. DNA DataBank of Japan (DDBJ)

    • scicrunch.org
    • neuinfo.org
    • +1 more
    Updated Oct 24, 2016
    Cite
    (2016). DNA DataBank of Japan (DDBJ) [Dataset]. http://identifiers.org/RRID:SCR_002359
    Dataset updated
    Oct 24, 2016
    Description

    Maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the DDBJ Omics Archive (DOR) and BioProject. DOR is an archival database of functional genomics data generated by microarray and highly parallel new-generation sequencers. Data are exchanged between ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and data from projects that are deposited into different databases.

  17. NEON (National Ecological Observatory Network) Fish sequences DNA barcode...

    • data.neonscience.org
    zip
    Updated Dec 15, 2024
    + more versions
    Cite
    (2024). NEON (National Ecological Observatory Network) Fish sequences DNA barcode (DP1.20105.001) [Dataset]. https://data.neonscience.org/data-products/DP1.20105.001
    Available download formats: zip
    Dataset updated
    Dec 15, 2024
    License

    https://www.neonscience.org/data-samples/data-policies-citation

    Time period covered
    Nov 2017 - Dec 2024
    Area covered
    LECO, WLOU, POSE, BLDE, GUIL, HOPB, TECR, LIRO, CUPE, SYCA
    Description

    COI DNA sequences from select fish in lakes and wadeable streams

  18. T4-like genome database

    • rrid.site
    • dknet.org
    • +1 more
    Updated Dec 4, 2023
    Cite
    (2023). T4-like genome database [Dataset]. http://identifiers.org/RRID:SCR_005367
    Dataset updated
    Dec 4, 2023
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented August 22, 2016. A database of information on bacterial phages. It contains multiple phage genomes, which users can BLAST and MegaBLAST, and also hosts a Phage Forum in which users can discuss phage data. Interactive browsing of completed phage genomes is available using the program. The browser allows users to scan the genome for particular features and to download sequence information plus analyses of those features. Views of the genome are generated showing named genes, BLAST similarities to other phages, predicted tRNAs, and other sequence features.

  19. Pseudomonas Genome Database

    • rrid.site
    Updated Jul 18, 2018
    Cite
    (2018). Pseudomonas Genome Database [Dataset]. http://identifiers.org/RRID:SCR_006590
    Dataset updated
    Jul 18, 2018
    Description

    Database of peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome, expanded to include all Pseudomonas species to facilitate cross-strain and cross-species genome comparisons with high-quality comparative genomics. The database contains a robust assessment of orthologs and a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean-searchable log file of updates for the reference strain PAO1. The current annotation is updated using recent research literature and peer-reviewed submissions by a worldwide community of PseudoCAP (Pseudomonas aeruginosa Community Annotation Project) participating researchers. If you are interested in participating, you are invited to get involved. Many annotations, DNA sequences, orthologs, intergenic DNA, and protein sequences are available for download.

  20. mtDB - Human Mitochondrial Genome Database

    • neuinfo.org
    • dknet.org
    • +2 more
    Updated Jan 29, 2022
    Cite
    (2022). mtDB - Human Mitochondrial Genome Database [Dataset]. http://identifiers.org/RRID:SCR_002945
    Dataset updated
    Jan 29, 2022
    Description

    A database of human mitochondrial genomes containing mtDNA sequences, polymorphic sites, and the ability to search for specific variants. It contains 1865 complete sequences and 839 coding region sequences.


GenBank

RRID:SCR_002760, nif-0000-02873, r3d100010528, OMICS_01650, GenBank (RRID:SCR_002760), GB, Gen Bank, GenBank

53 scholarly articles cite this dataset (View in Google Scholar)