100+ datasets found

GenBank
neuinfo.org
Updated Sep 17, 2024
Cite
(2024). GenBank [Dataset]. http://identifiers.org/RRID:SCR_002760
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002760
Dataset updated
Sep 17, 2024
Description
NIH genetic sequence database that provides an annotated collection of all publicly available DNA sequences for almost 280,000 formally described species (as of Jan 2014). These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. It is part of the International Nucleotide Sequence Database Collaboration, and daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
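As a rough illustration of programmatic access through Entrez, the sketch below builds a request URL for the NCBI E-utilities `efetch` endpoint, which returns a nucleotide record in GenBank flat-file format. The helper name `genbank_efetch_url` is hypothetical, and the accession is used purely as an example (L49178.1 is the accession cited in the BACtpm entry further down this page). The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_efetch_url(accession: str) -> str:
    """Build an efetch URL for one nucleotide record (hypothetical helper).

    db=nuccore selects the nucleotide database; rettype=gb with
    retmode=text requests the GenBank flat-file representation.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "gb",
        "retmode": "text",
    }
    return f"{EUTILS_EFETCH}?{urlencode(params)}"


# Example accession only; any valid nucleotide accession could be used.
url = genbank_efetch_url("L49178.1")
print(url)
```

The resulting URL can be passed to any HTTP client. For bulk retrieval, the FTP releases mentioned above are likely the better fit.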
T4-like genome database
neuinfo.org
Updated Jan 29, 2022
Cite
(2022). T4-like genome database [Dataset]. http://identifiers.org/RRID:SCR_005367
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_005367
Dataset updated
Jan 29, 2022
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 22, 2016. A database of information on bacterial phages. It contains multiple phage genomes, which users can search with BLAST and MegaBLAST, and also hosts a Phage Forum in which users can discuss phage data. Interactive browsing of completed phage genomes is available using the program. The browser allows users to scan the genome for particular features and to download sequence information plus analyses of those features. Views of the genome are generated showing named genes, BLAST similarities to other phages, predicted tRNAs, and other sequence features.
DNA sequencing raw data and analytical results by bioinformatics for column...
catalog.data.gov
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). DNA sequencing raw data and analytical results by bioinformatics for column study on algal organic matter impact. [Dataset]. https://catalog.data.gov/dataset/dna-sequencing-raw-data-and-analytical-results-by-bioinformatics-for-column-study-on-algal
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
The Excel spreadsheet includes sample IDs and labeling information for DNA sequencing raw data. In addition, DNA concentrations for all the biofilm samples analyzed are presented. This dataset is associated with the following publication: Jeon, Y., L. Li, J. Calvillo, H. Ryu, J. Santo Domingo, O. Choi, J. Brown, and Y. Seo. Impact of algal organic matter on the performance, cyanotoxin removal, and biofilms of biologically-active filtration systems. WATER RESEARCH. Elsevier Science Ltd, New York, NY, USA, 184: 116120, (2020).
DNA sequencing data in MDS-patients treated with allogeneic transplantation
researchdata.se
Updated Dec 18, 2023
Cite
Tetsuichi Yoshizato; Sten Eirik Jacobsen (2023). DNA sequencing data in MDS-patients treated with allogeneic transplantation [Dataset]. http://doi.org/10.48723/0k7w-2k05
Explore at:
Available download formats: (546461), (1535)
Unique identifier
https://doi.org/10.48723/0k7w-2k05
Dataset updated
Dec 18, 2023
Dataset provided by
Karolinska Institutet
Authors
Tetsuichi Yoshizato; Sten Eirik Jacobsen
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
The deposited data consist of 47 BAM files from targeted DNA sequencing and a list of somatic mutations called from bulk whole-genome sequencing in patients with myelodysplastic syndromes or related myeloid malignancies who received allogeneic stem cell transplantation. The objective of this data collection was to assess whether somatic mutations can serve as markers for detecting early relapse. DNA sequencing was performed to identify somatic mutation candidates using samples collected at diagnosis, and also to compare the detection sensitivity of digital droplet PCR and next-generation sequencing. We used three different gene panels for targeted DNA sequencing; the gene lists can be found in the cited Blood paper. Reads were aligned against GRCh37. In two patients in whom no recurrent driver mutations were identified, whole-genome sequencing was performed. After alignment to GRCh37, somatic mutations were called using Genomon2, and the resulting somatic mutation list was deposited.

The total size of the deposited data is approximately 25 GB (24586652781 bytes).
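To make the stated size easier to compare with storage quotas, which may be reported in either decimal or binary units, here is a small sketch of the conversion. The function name `size_in_units` is hypothetical. The only input taken from the entry is the byte count.

```python
def size_in_units(n_bytes: int) -> tuple[float, float]:
    """Return (decimal gigabytes, binary gibibytes) for a byte count.

    1 GB  = 10**9 bytes (SI, decimal)
    1 GiB = 2**30 bytes (IEC, binary)
    """
    return n_bytes / 10**9, n_bytes / 2**30


# Byte count quoted in the dataset description above.
gb, gib = size_in_units(24586652781)
print(f"{gb:.2f} GB, {gib:.2f} GiB")
```

Under these definitions the deposit comes to roughly 24.6 GB, or about 22.9 GiB, consistent with the "approximately 25 GB" figure.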
Statistics of the chloroplast genome sequencing data
figshare.unimelb.edu.au
pdf
Updated Jul 14, 2020
Cite
Chenxi Zhou (2020). Statistics of the chloroplast genome sequencing data [Dataset]. http://doi.org/10.26188/12652067.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.26188/12652067.v1
Dataset updated
Jul 14, 2020
Dataset provided by
The University of Melbourne
Authors
Chenxi Zhou
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Statistics of the total DNA sequencing data, the raw cpDNA sequence data extracted from the total DNA sequence data and the processed cpDNA sequence data after trimming (for Illumina reads) or error correction (for Nanopore reads).
Table_1_Cross-sectional use of barcode of life data system and GenBank as...
figshare.com
docx
Updated Jun 13, 2023
Cite
Takeru Nakazato; Utsugi Jinbo (2023). Table_1_Cross-sectional use of barcode of life data system and GenBank as DNA barcoding databases for the advancement of museomics.DOCX [Dataset]. http://doi.org/10.3389/fevo.2022.966605.s003
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fevo.2022.966605.s003
Dataset updated
Jun 13, 2023
Dataset provided by
Frontiers
Authors
Takeru Nakazato; Utsugi Jinbo
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Museomics is an approach to the DNA sequencing of museum specimens that can generate both biodiversity and sequence information. In this study, we surveyed both the biodiversity information-based database BOLD (Barcode of Life Data System) and the sequence information database GenBank, using DNA barcoding data as an example, with the aim of integrating the data from these two databases. DNA barcoding is a method of identifying species from DNA sequences by using short genetic markers. We surveyed how many entries had biodiversity information (such as links to BOLD and specimen IDs) by downloading all fish, insect, and flowering plant data available from GenBank Nucleotide; a BOLD ID was assigned to 26.2% of entries for insects. In the same way, we downloaded the respective BOLD data and checked the status of links to sequence information. We also investigated how many species these databases cover, and 7,693 species were found to exist only in BOLD. In the future, as museomics develops as a field, the targeted sequences will be extended beyond DNA barcodes to mitochondrial genomes, other genes, and genome sequences. Consequently, the value of the sequence data will increase. In addition, various species will be sequenced and, thus, biodiversity information, such as the evidence specimen photographs used as a basis for species identification, will become even more indispensable. This study contributes to the acceleration of museomics-associated research by using databases in a cross-sectional manner.
Whole genome sequencing data for spirit collection specimens
data.csiro.au
Updated Sep 3, 2021
Cite
Clare Holleley; Marina Alexander; Alicia Grealy (2021). Whole genome sequencing data for spirit collection specimens [Dataset]. http://doi.org/10.25919/583e-2j57
Explore at:
Unique identifier
https://doi.org/10.25919/583e-2j57
Dataset updated
Sep 3, 2021
Dataset provided by
CSIRO (https://www.csiro.au/)
Authors
Clare Holleley; Marina Alexander; Alicia Grealy
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset funded by
CSIRO (https://www.csiro.au/)
Description
Raw Illumina short read sequencing data derived from genomic DNA extractions from formalin and ethanol preserved tissue. Tissues were harvested from twelve spirit vault specimens from the Australian National Wildlife Collection. Nine of the twelve specimens were formalin-preserved and the remaining three (wedge-tailed eagle, great black cormorant and koala) were ethanol preserved. Three DNA extraction methods and two library preparation methods were tested.

Species: wedge-tailed eagle (Aquila audax), saltwater crocodile (Crocodylus porosus), Australian kestrel (Falco cenchroides), tammar wallaby (Macropus eugenii), budgie (Melopsittacus undulatus), platypus (Ornithorhynchus anatinus), great black cormorant (Phalacrocorax carbo), koala (Phascolarctos cinereus), dwarf bearded dragon (Pogona minima), bearded dragon (Pogona vitticeps), cane toad (Rhinella marina), zebra finch (Taeniopygia guttata)

Lineage: Sample metadata and methods used are included in the dataset.
NEON (National Ecological Observatory Network) Fish sequences DNA barcode...
data.neonscience.org
zip
Updated Dec 15, 2024
Cite
(2024). NEON (National Ecological Observatory Network) Fish sequences DNA barcode (DP1.20105.001) [Dataset]. https://data.neonscience.org/data-products/DP1.20105.001
Explore at:
Available download formats: zip
Dataset updated
Dec 15, 2024
License
https://www.neonscience.org/data-samples/data-policies-citation
Time period covered
Nov 2017 - Dec 2024
Area covered
LECO, WLOU, CUPE, HOPB, POSE, BLDE, GUIL, SYCA, TECR, LIRO
Description
COI DNA sequences from select fish in lakes and wadeable streams
The tpm metabarcoding DNA sequence database for taxonomic allocations using...
dataon.kisti.re.kr
Updated Jun 23, 2021
Cite
POZZI Adrien C.M.;MARJOLET Laurence;COURNOYER Benoît (2021). The tpm metabarcoding DNA sequence database for taxonomic allocations using RDP classifier implemented in DADA2. [Dataset]. https://dataon.kisti.re.kr/search/78bdd4325edd4066e88f23e87f192507
Explore at:
Dataset updated
Jun 23, 2021
Authors
POZZI Adrien C.M.;MARJOLET Laurence;COURNOYER Benoît
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The tpm metabarcoding DNA sequence database for taxonomic allocations using the Mothur and DADA2 bio-informatic tools

A.C.M. Pozzi (1), R. Bouchali (1), L. Marjolet (1), B. Cournoyer (1)

(1) University of Lyon, UMR Ecologie Microbienne Lyon (LEM), CNRS 5557, INRAE 1418, Université Claude Bernard Lyon 1, VetAgro Sup, Research Team “Bacterial Opportunistic Pathogens and Environment” (BPOE), 69280 Marcy L’Etoile, France.

Corresponding authors:
A.C.M. Pozzi, UMR Microbial Ecology, CNRS 5557, CNRS 1418, VetAgro Sup, Main building, aisle 3, 1st floor, 69280 Marcy-L’Etoile, France. Tel. (+33) 478 87 39 47. Fax. (+33) 472 43 12 23. Email: adrien.meynier_pozzi@vetagro-sup.fr
B. Cournoyer, UMR Microbial Ecology, CNRS 5557, CNRS 1418, VetAgro Sup, Main building, aisle 3, 1st floor, 69280 Marcy-L’Etoile, France. Tel. (+33) 478 87 56 47. Fax. (+33) 472 43 12 23. Email: benoit.cournoyer@vetagro-sup.fr

Keywords: BACtpm, Bacteria, tpm, thiopurine-S-methyltransferase EC:2.1.1.67, Nucleotide sequences, PCR products, Next-Generation-Sequencing, OTHU

Description: The tpm gene codes for the thiopurine-S-methyltransferase (TPMT), an enzyme that can detoxify metalloid-containing oxyanions and xenobiotics (Cournoyer et al., 1998). Bacterial TPMTs radiated apart from human and animal TPMTs, and showed a vertical evolution in line with the 16S rRNA gene molecular phylogeny (Favre-Bonté et al., 2005). The tpm database, named BACtpm, was designed to apply the tpm-metabarcoding analytical scheme published in Aigle et al. (2021). It includes the full tpm identifiers, GenBank accession numbers, and complete taxonomic records (domain down to strain code) for tpm sequences, each about 215 nucleotides long, of 840 unique taxa belonging to 139 genera. Nucleotide sequences of tpm (range: 190-233 nucleotides) were either retrieved from public repositories (GenBank) or made available by B. Cournoyer’s research group. Colin et al. (2020) described the PCR and high throughput Illumina MiSeq DNA sequencing procedures used to produce tpm sequences. BACtpm v.2.0.1 (June 2021 release) is made available under the Creative Commons Attribution 4.0 International Licence. It can be used for the taxonomic allocations of tpm sequences down to the species and strain levels. Data is stored in the csv format, enabling future users to reformat it to fit their specific needs.

Acknowledgments: We thank the worldwide community of microbiologists who made contributions to public databases in the past decades, and made possible the elaboration of the BACtpm database. We also thank the Field Observatory in Urban Hydrology (OTHU, www.graie.org/othu/), Labex IMU (Intelligence des Mondes Urbains), the Greater Lyon Urban Community, the School of Integrated Watershed Sciences H2O'LYON, and the Lyon Urban School for their support in the development of this database. This work was funded by the French national research program for environmental and occupational health of ANSES under the terms of project “Iouqmer” EST 2016/1/120, l'Agence Nationale de la Recherche through ANR-16-CE32-0006, ANR-17-CE04-0010, ANR-17-EURE-0018 and ANR-17-CONV-0004, by the MITI CNRS project named Urbamic, and the French water agency for the Rhône, Mediterranean and Corsica areas through the Desir and DOmic projects. We thank former BPOE lab members who contributed to start and expand the BACtpm database: Céline COLINON, Romain MARTI, Emilie BOURGEOIS, Sébastien RIBUN and Yannick COLIN.

References:
Aigle, A., Colin, Y., Bouchali, R., Bourgeois, E., Marti, R., Ribun, S., Marjolet, L., Pozzi, A.C.M., Misery, B., Colinon, C., Bernardin-Souibgui, C., Wiest, L., Blaha, D., Galia, W., Cournoyer, B., 2021. Spatio-temporal variations in chemical pollutants found among urban deposits match changes in thiopurine S-methyltransferase-harboring bacteria tracked by the tpm metabarcoding approach. Sci. Total Environ. 767, 145425. https://doi.org/10.1016/j.scitotenv.2021.145425
Colin, Y., Bouchali, R., Marjolet, L., Marti, R., Vautrin, F., Voisin, J., Bourgeois, E., Rodriguez-Nava, V., Blaha, D., Winiarski, T., Mermillod-Blondin, F., Cournoyer, B., 2020. Coalescence of bacterial groups originating from urban runoffs and artificial infiltration systems among aquifer microbiomes. Hydrol. Earth Syst. Sci. 24, 4257–4273. https://doi.org/10.5194/hess-24-4257-2020
Cournoyer, B., Watanabe, S., Vivian, A., 1998. A tellurite-resistance genetic determinant from phytopathogenic pseudomonads encodes a thiopurine methyltransferase: evidence of a widely-conserved family of methyltransferases. [Title footnote: The International Collaboration (IC) accession number of the DNA sequence is L49178.1.] Biochim. Biophys. Acta BBA - Gene Struct. Expr. 1397, 161–168. https://doi.org/10.1016/S0167-4781(98)00020-7
Favre-Bonté, S., Ranjard, L., Colinon, C., Prigent-Combaret, C., Nazaret, S., Cournoyer, B., 2005. Freshwater selenium-methylating bacterial thiopurine methyltransferases: diversity and molecular phylogeny. Environ. Microbiol. 7, 153–164. https://doi.org/10.1111/j.1462-2920.2004.00670.x

Change Log:
[2.0.1] - 2021-06-23: tpm nucleotide sequences now provided in two separate columns, either aligned with gaps for repeatable use in Mothur or not aligned and without gaps for use with DADA2.
[2.1.1] - 2023-10-10: tpm nucleotide sequences added for 20 taxa (Actinoplanes sp. N902-109, Ancylobacter polymorphus DSM2457, Aromatoleum toluclasticum ATCC700605, Aromatoleum bremense PbN1, Aromatoleum diolicum, Candidatus_Macondimonas diazotrophica, Collimonas sp. PAH2, Collimonas humicolas, Emcibacter nanhaiensis CGMCC112471, Leptospira yasudae, Lysobacter sp. TY298, Lysobacter spongiae KACC19276, Lysobacter sp. CF310, Nitrospira sp. ND1, Pseudanabaena biceps PCC7429, Pseudomonas eucalypticola NP1, Pseudomonas alcaligenes MB-090714, Pseudomonas peli DSM17833, Pseudomonas sp. 9AZ, and Pseudomonas sp. NFACC02), Proteobacteria updated to Pseudomonadota, database formatted uniquely for use with RDP/dada2.
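The change log distinguishes gapped (aligned) sequences for Mothur from ungapped sequences for DADA2. As a minimal sketch of how one form relates to the other, the snippet below strips alignment gap characters from an aligned sequence. It assumes gaps are written as "-" or ".", the usual convention in Mothur-style alignments. The function name and the toy sequence are hypothetical and are not taken from BACtpm.

```python
def strip_alignment_gaps(aligned_seq: str) -> str:
    """Remove gap characters ('-' and '.') from an aligned sequence.

    Assumption: gaps use '-' or '.', as is common in Mothur-style
    alignments; the actual BACtpm encoding should be checked first.
    """
    return aligned_seq.replace("-", "").replace(".", "")


# Toy aligned sequence for illustration only (not a real tpm sequence).
toy_aligned = "..AT-GC--TA.."
print(strip_alignment_gaps(toy_aligned))
```

Since the database already ships both columns, a conversion like this would mainly be useful for checking that the two columns agree.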
High Throughput Genomic Sequences Division
dknet.org
Updated Jan 29, 2022
Cite
(2022). High Throughput Genomic Sequences Division [Dataset]. http://identifiers.org/RRID:SCR_002150/resolver
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002150 https://identifiers.org/RRID:SCR_002150/resolver
Dataset updated
Jan 29, 2022
Description
Database of high-throughput genome sequences from large-scale genome sequencing centers, including unfinished and finished sequences. It was created to accommodate a growing need to make unfinished genomic sequence data rapidly available to the scientific community in a coordinated effort among the International Nucleotide Sequence databases, DDBJ, EMBL, and GenBank. Sequences are prepared for submission by using NCBI's software tools Sequin or tbl2asn. Each center has an FTP directory into which new or updated sequence files are placed. Sequence data in this division are available for BLAST homology searches against either the htgs database or the month database, which includes all new submissions for the prior month. Unfinished HTG sequences containing contigs greater than 2 kb are assigned an accession number and deposited in the HTG division. A typical HTG record might consist of all the first-pass sequence data generated from a single cosmid, BAC, YAC, or P1 clone, which together make up more than 2 kb and contain one or more gaps. A single accession number is assigned to this collection of sequences, and each record includes a clear indication of the status (phase 1 or 2) plus a prominent warning that the sequence data are unfinished and may contain errors. The accession number does not change as sequence records are updated; only the most recent version of a HTG record remains in GenBank.
DNA sequencing data for "Stable clonal contribution of lineage-restricted...
researchdata.se
Updated Nov 27, 2025
Cite
Tetsuichi Yoshizato; Sten Eirik Jacobsen (2025). DNA sequencing data for "Stable clonal contribution of lineage-restricted stem cells to human hematopoiesis" [Dataset]. http://doi.org/10.48723/313d-dd68
Explore at:
Available download formats: (7285), (160591), (28086), (24240), (2928)
Unique identifier
https://doi.org/10.48723/313d-dd68
Dataset updated
Nov 27, 2025
Dataset provided by
Karolinska Institutet
Authors
Tetsuichi Yoshizato; Sten Eirik Jacobsen
Time period covered
2018 - 2024
Area covered
Sweden
Description
This dataset contains three types of DNA sequencing data:
- Error-corrected DNA capture sequencing (ECTS)
- Bulk whole-exome sequencing (WES)
- Single-colony whole-genome sequencing (WGS)

All sequencing was performed on an Illumina NovaSeq 6000 at the National Genomics Infrastructure in Stockholm, using paired-end sequencing mode.

ECTS: Bone marrow mononuclear cells (BM MNCs) isolated from all 93 healthy donors were subjected to ECTS to identify somatic mutations in 23 targeted genes encompassing the most recurrently mutated genes reported in clonal hematopoiesis.

WES: BM MNC DNA isolated at the first visit from 20 healthy donors above 71 years of age was subjected to bulk WES. Paired buccal swab DNA was used as the normal control.

Single-colony WGS: DNA extracted from 333 genotyped single colonies and 10 control buccal swabs from 10 donors was subjected to WGS.

The dataset consists of three folders:
- scWGS contains 343 files in CRAM format, totaling approximately 16.3 TiB (17.9 TB).
- WES contains 40 files in CRAM format, totaling approximately 650 GiB (710 GB).
- ECTS contains 117 files in CRAM format, totaling approximately 18 GiB (20 GB).
DNA Sequencing Market Growth Analysis - Size and Forecast 2024-2028 |...
technavio.com
pdf
Updated May 17, 2024
Cite
Technavio (2024). DNA Sequencing Market Growth Analysis - Size and Forecast 2024-2028 | Technavio [Dataset]. https://www.technavio.com/report/dna-sequencing-market-industry-analysis
Explore at:
Available download formats: pdf
Dataset updated
May 17, 2024
Dataset provided by
TechNavio
Authors
Technavio
License
https://www.technavio.com/content/privacy-notice
Time period covered
2024 - 2028
Description
DNA Sequencing Market Size 2024-2028

The DNA sequencing market size is forecast to increase by USD 17.34 billion, at a CAGR of 20.01% between 2023 and 2028. The market is experiencing significant growth, driven by the increasing adoption of Next-Generation Sequencing (NGS) technologies. NGS offers several advantages over traditional Sanger sequencing, including faster turnaround time, lower costs, and the ability to sequence entire genomes. This technological advancement has led to a surge in demand for DNA sequencing in various applications, including diagnostics, research, and forensics technologies.

However, the market faces challenges, most notably the emergence of third-generation sequencing methods. These new technologies, such as PacBio and Oxford Nanopore, offer even faster sequencing speeds and longer read lengths than NGS. As these methods continue to advance, they may disrupt the market dynamics and force companies to innovate or risk becoming obsolete.

Additionally, inadequate resources for DNA sequencing in developing countries pose a significant challenge. Despite the potential benefits of DNA sequencing, many countries lack the necessary infrastructure and financial resources to implement these technologies. Companies that can address these challenges and provide affordable and accessible solutions will be well-positioned to capitalize on the growing demand for DNA sequencing.

What will be the Size of the DNA Sequencing Market during the forecast period?

Explore in-depth regional segment analysis with market size data - historical 2018-2022 and forecasts 2024-2028 - in the full report.

The market continues to evolve, driven by advancements in technologies and applications across various sectors. Base calling, a fundamental process in sequencing, is being refined through the use of artificial intelligence and machine learning algorithms. Microbial sequencing, a key application, is revolutionizing fields such as metagenomics and environmental science. Precision medicine, another significant area, is benefiting from the integration of genomic data into clinical workflows, enabling personalized treatment plans. Nanopore sequencing, known for its long read length, is gaining traction in genome assembly and gene expression analysis. Variant calling, a crucial step in identifying genetic mutations, is being enhanced by the integration of multiple data sources and advanced algorithms.

Sample preparation, a critical step in the sequencing process, is being optimized for improved efficiency and cost reduction. Sequencing depth, read length, and sequencing coverage are key performance indicators that continue to evolve, enabling the detection of rare variants and complex genomic structures. Rare disease research is a growing application area, with high-throughput sequencing and exome sequencing playing a pivotal role in identifying disease-causing mutations. Regulatory compliance, data security, and data storage are becoming increasingly important considerations in the market. Cost reduction and workflow optimization are ongoing priorities for sequencing platform providers, with next-generation sequencing (NGS) and Sanger sequencing continuing to coexist in the market.

The ongoing advancements in DNA sequencing technologies and applications are shaping the market dynamics, with genome editing, clinical diagnostics, and forensic science being some of the emerging areas of focus. The integration of cloud computing and library preparation into the sequencing workflow is also transforming the market landscape. In the realm of research, applications such as phylogenetic analysis, methylation analysis, SNP analysis, and quality control are driving the adoption of sequencing technologies. The sequencing error rate, a critical performance metric, is being addressed through the development of advanced algorithms and sequencing reagents. Infectious disease research is another area of significant growth, with NGS playing a crucial role in identifying disease-causing pathogens and understanding their genetic makeup.

Targeted sequencing and CNV analysis are also gaining popularity in this field, enabling the detection of specific genetic variants and chromosomal aberrations. The market is a dynamic and evolving landscape, with ongoing advancements in technologies and applications shaping its future direction. The integration of various components, including base calling, microbial sequencing, precision medicine, variant calling, nanopore sequencing, sample preparation, sequencing depth, read length, rare disease research, sequencing coverage, sequencing reagents, high-throughput sequencing, exome sequencing, regulatory compliance, mutation detection, Illumina sequencing, CNV analysis, data storage, library p
High Throughput Genomic Sequences Division
rrid.site
Cite
High Throughput Genomic Sequences Division [Dataset]. http://identifiers.org/RRID:SCR_002150
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002150
Description
Database of high-throughput genome sequences from large-scale genome sequencing centers, including unfinished and finished sequences. It was created to accommodate a growing need to make unfinished genomic sequence data rapidly available to the scientific community in a coordinated effort among the International Nucleotide Sequence databases, DDBJ, EMBL, and GenBank. Sequences are prepared for submission by using NCBI's software tools Sequin or tbl2asn. Each center has an FTP directory into which new or updated sequence files are placed. Sequence data in this division are available for BLAST homology searches against either the htgs database or the month database, which includes all new submissions for the prior month. Unfinished HTG sequences containing contigs greater than 2 kb are assigned an accession number and deposited in the HTG division. A typical HTG record might consist of all the first-pass sequence data generated from a single cosmid, BAC, YAC, or P1 clone, which together make up more than 2 kb and contain one or more gaps. A single accession number is assigned to this collection of sequences, and each record includes a clear indication of the status (phase 1 or 2) plus a prominent warning that the sequence data are unfinished and may contain errors. The accession number does not change as sequence records are updated; only the most recent version of a HTG record remains in GenBank.
Analysis of mitochondrial DNA sequence data from Myotis lucifugus and Myotis...
catalog.data.gov
Updated Jan 8, 2026
Cite
U.S. Geological Survey (2026). Analysis of mitochondrial DNA sequence data from Myotis lucifugus and Myotis occultus [Dataset]. https://catalog.data.gov/dataset/analysis-of-mitochondrial-dna-sequence-data-from-myotis-lucifugus-and-myotis-occultus-25b13
Explore at:
Dataset updated
Jan 8, 2026
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
The validity of Myotis occultus as a species unique from M. lucifugus has been a source of debate. Most recently, many authorities treat M. occultus as a distinct species, at least in part because a previous study showed that M. occultus and M. l. carissima (the subspecies that occurs in closest proximity to M. occultus) form separate monophyletic clades based on sequences of two mitochondrial genes (cytochrome-b [cytb] and cytochrome oxidase subunit II [COII]). We re-evaluated the phylogenetic relationship between M. occultus and M. lucifugus based on mitochondrial sequences using an expanded dataset of cytb and COII sequences that originated from more genetically diverse specimens of M. lucifugus collected across a broader geographic area. Based on a phylogenetic analysis, we found that M. occultus sublineages were embedded within a well-supported clade that included some specimens of M. lucifugus. These results indicate that the previous genetic analysis demonstrating that M. occultus and M. lucifugus form distinct monophyletic groups is unsupported by our larger dataset. Future genetic work involving whole genome sequencing of nuclear DNA will likely be needed to better resolve the true taxonomic relationship between M. occultus and M. lucifugus.
ZooGene - A DNA Sequence Database for Calanoid Copepods and Euphausiids
catalog.data.gov
Updated Jan 1, 2002
Cite
University of New Hampshire (Point of Contact) (2002). ZooGene - A DNA Sequence Database for Calanoid Copepods and Euphausiids [Dataset]. https://catalog.data.gov/km/dataset/zoogene-a-dna-sequence-database-for-calanoid-copepods-and-euphausiids
Explore at:
Dataset updated
Jan 1, 2002
Dataset provided by
University of New Hampshire (Point of Contact)
Description
An international partnership created a zooplankton genomic (ZooGene) database of DNA type sequences for calanoid copepods and euphausiids. The ZooGene database was designed to include all species of these groups and to allow expansion to additional zooplankton groups. The ZooGene partnership includes four P.I.s and thirteen expert taxonomic consultants from seven countries. Zooplankton samples are sorted from existing archival collections, obtained in coordination with planned oceanographic research efforts, and collected during National Marine Fisheries Service field surveys. The taxonomic experts confirm species' identifications; DNA sequencing is done at the University of New Hampshire and, in some cases, in other partners' laboratories. For each species, a DNA type sequence is determined for a portion of the mitochondrial cytochrome oxidase I (mtCOI) gene; multiple mtCOI sequences are included as necessary to reflect intraspecific variation. The ZooGene database is designed, created, managed, maintained, and distributed as part of the proposed work; the data is integrated into the Ocean Biogeographical Information System (OBIS).
Data from: Genome Database for Vaccinium
agdatacommons.nal.usda.gov
bin
Updated Nov 30, 2023
Dorrie S. Main; Sook Jung (2023). Genome Database for Vaccinium [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Genome_Database_for_Vaccinium/24661548
Available download formats: bin
Dataset updated
Nov 30, 2023
Dataset provided by
MainLab Bioinformatics, Washington State University
Authors
Dorrie S. Main; Sook Jung
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
The Genome Database for Vaccinium (GDV) is a curated and integrated web-based relational database. The GDV is being developed to house and integrate genomic, genetic and breeding data for blueberry, cranberry and other Vaccinium species. The GDV will include the blueberry genome being sequenced by North Carolina State University, and annotated transcripts, traits, maps and markers being generated by Vaccinium researchers. The GDV is implemented using Chado and Drupal (Tripal) and will include public and private sites to meet individual research group needs. The amount of genetic research data for Vaccinium is steadily increasing, and there is a need for a system that can organize, filter and analyze the available research so that it can be applied directly in breeding programs. The idea of creating this database emerged from several group discussions, in which scientists, breeders, data curators, university professors and bioinformaticians started working on the publicly available genetic and genomic information to make it available for practical use in breeding programs. The GDV home page allows for quick access to species-specific data and popular tools. Tools to view and compare genetic maps, a BLAST tool for genomes and reference transcriptomes, and search interfaces to find and download marker, map, QTL, and sequence data are also included. Crop pages have quick links to data and tools in the sidebar, and MapViewer allows for dynamic visualization of genetic maps. Resources in this dataset: Resource Title: Genome Database for Vaccinium. File Name: Web Page, URL: https://www.vaccinium.org/
Additional file 1: of What's in your next-generation sequence data? An...
springernature.figshare.com
xlsx
Updated May 31, 2023
Lynsey Whitacre; Polyana Tizioto; JaeWoo Kim; Tad Sonstegard; Steven Schroeder; Leeson Alexander; Juan Medrano; Robert Schnabel; Jeremy Taylor; Jared Decker (2023). Additional file 1: of What's in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual [Dataset]. http://doi.org/10.6084/m9.figshare.c.3639158_D1.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3639158_D1.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Lynsey Whitacre; Polyana Tizioto; JaeWoo Kim; Tad Sonstegard; Steven Schroeder; Leeson Alexander; Juan Medrano; Robert Schnabel; Jeremy Taylor; Jared Decker
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Table S1. Statistics from the de novo assembly of unmapped reads from DNA sequencing. Table S2. Statistics from the de novo assembly of unmapped reads from RNA sequencing. Table S3. Summary of all significant alignments from pairwise alignment of de novo assembled contigs from DNA unmapped reads to the nt database. Table S4. Number of significant alignments per tissue from pairwise alignment of de novo assembled contigs from unmapped RNA-seq reads to the nt database. Table S5. Summary of all significant alignments from pairwise alignment of de novo assembled contigs from unmapped RNA-seq reads to the nt database. Table S6. DNA sequencing metadata. Table S7. Genes represented in alignments of de novo assembled contigs from unmapped RNA-seq reads to Bos taurus. Table S8. Genes represented in alignments of de novo assembled contigs from unmapped RNA-seq reads to Bison bison bison, Bubalus bubalis, or Bos mutus. (XLSX 731 kb)
FAIRsharing record for: National Omics Data Encyclopedia
search.datacite.org
Updated 2018
FAIRsharing Team (2018). FAIRsharing record for: National Omics Data Encyclopedia [Dataset]. http://doi.org/10.25504/fairsharing.eivzjw
Unique identifier
https://doi.org/10.25504/fairsharing.eivzjw
Dataset updated
2018
Dataset provided by
DataCite
FAIRsharing
Authors
FAIRsharing Team
Description
This FAIRsharing record describes: The National Omics Data Encyclopedia (NODE) is a big data library with complete and integrative data storage, safe and efficiency-guaranteed data management, and comprehensive, user-friendly data service functions. NODE stores raw sequence data from "next-generation" sequencing technologies including 454, IonTorrent, Illumina, SOLiD, Helicos and Complete Genomics. In addition to raw sequence data, NODE now stores alignment information in the form of read placements on a reference sequence.
Reference sequence database for eDNA metabarcoding of San Francisco estuary...
search.dataone.org
Updated Nov 29, 2023
Raman Nagarajan; Ann Holmes; Andrea Schreier (2023). Reference sequence database for eDNA metabarcoding of San Francisco estuary fishes and invertebrates [Dataset]. http://doi.org/10.5061/dryad.0p2ngf25z
Unique identifier
https://doi.org/10.5061/dryad.0p2ngf25z
Dataset updated
Nov 29, 2023
Dataset provided by
Dryad Digital Repository
Authors
Raman Nagarajan; Ann Holmes; Andrea Schreier
Time period covered
Jan 1, 2023
Description
Environmental DNA (eDNA) methods complement traditional monitoring and can be configured to detect multiple species simultaneously. One such approach, eDNA metabarcoding, uses high-throughput DNA sequencing to indirectly detect many different organisms, spanning broad taxonomic boundaries, from water samples. We are optimizing a non-invasive, low-cost eDNA metabarcoding protocol to be used in conjunction with existing monitoring programs. One resource that is currently lacking for metabarcoding studies in general, including those in the San Francisco Estuary (SFE), is a comprehensive database of DNA barcode reference sequences. Without this foundational data, many species go undetected or are misidentified in metabarcoding studies. To meet this need, we generated a custom barcode sequence database for the SFE by DNA sequencing and mining of public DNA sequence data for estuarine and freshwater species of interest to monitoring programs and ecological studies. Here we present custom referenc...
Data from: Sequencing Intractable DNA to Close Microbial Genomes
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jul 31, 2012
Podar, Mircea; Hurt Jr, Richard A.; Palumbo, Anthony V.; Brown, Steven D.; Elias, Dwayne A. (2012). Sequencing Intractable DNA to Close Microbial Genomes [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001160163
Dataset updated
Jul 31, 2012
Authors
Podar, Mircea; Hurt Jr, Richard A.; Palumbo, Anthony V.; Brown, Steven D.; Elias, Dwayne A.
Description
Advancement in high-throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled “intractable”, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the “non-contiguous finished” Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation and provide a step-wise method that reduces the effort required for genome finishing.


GenBank

RRID:SCR_002760, nif-0000-02873, r3d100010528, OMICS_01650, GenBank (RRID:SCR_002760), GB, Gen Bank, GenBank


53 scholarly articles cite this dataset (View in Google Scholar)

