54 datasets found

NCBI Trace Archive
integbio.jp
Cite
NCBI (National Center for Biotechnology Information), NCBI Trace Archive [Dataset]. https://integbio.jp/dbcatalog/en/record/nbdc01944
Dataset provided by
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
Description
The Trace Archives comprise three archives. The Sequence Read Archive (SRA) stores raw sequence data from "next-generation" sequencing technologies including 454, IonTorrent, Illumina, SOLiD, Helicos and Complete Genomics. In addition to raw sequence data, SRA now stores alignment information in the form of read placements on a reference sequence. The Trace Archive serves as the repository for sequencing data from gel/capillary platforms such as Applied Biosystems ABI 3730. The Trace Assembly Archive stores pairwise and multiple alignments of sequencing reads, linking basic trace data with finished genomic sequence as found in GenBank.
Catalog of NCBI sequence read archive (SRA) data for salamanders at the...
portal.edirepository.org
csv
Updated Apr 9, 2024
Cite
Catalog of NCBI sequence read archive (SRA) data for salamanders at the Hubbard Brook Experimental Forest 2012-2021 [Dataset]. https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-hbr&identifier=398
Explore at:
Available download formats: csv (220695 bytes), csv (312227 bytes), csv (282251 bytes)
Unique identifier
https://doi.org/10.6073/pasta/6df7199d751ec81315395a042cbd8083
Dataset updated
Apr 9, 2024
Dataset provided by
EDI
Authors
Brett Addis; Madaline Cochrane; Winsor Lowe
Time period covered
2012 - 2021
Variables measured
strain, ecotype, isolate, lat_lon, cultivar, organism, Accession, BioProject, env_medium, sample_URL, and 8 more
Description
This project was designed to describe fine-scale population genetic differentiation of the stream salamander Gyrinophilus porphyriticus among five study streams in the Hubbard Brook Experimental Forest. The data are paired with intensive capture-recapture data to assess direct fitness effects of individual genetic diversity, including effects of individual multilocus heterozygosity on stage-specific survival probabilities.

This dataset publishes a manifest of the genomic sequence reads submitted to the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). These samples are published at NCBI under the BioProject ID 1090913 (https://www.ncbi.nlm.nih.gov/bioproject/1090913). The tables here include sample metadata and the NCBI URLs to each sample. These data were gathered as part of the Hubbard Brook Ecosystem Study (HBES). The HBES is a collaborative effort at the Hubbard Brook Experimental Forest, which is operated and maintained by the USDA Forest Service, Northern Research Station.
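
The BioProject link above follows a simple pattern: a base URL followed by the project identifier. As a minimal sketch, assuming identifiers are either a bare numeric ID (as in 1090913) or a PRJ-style accession (as used by other records on this page), a URL can be built as follows. The function name `bioproject_url` and the validation pattern are illustrative assumptions, not part of any NCBI tool.

```python
import re

# Base URL exactly as it appears in the dataset description.
BIOPROJECT_BASE = "https://www.ncbi.nlm.nih.gov/bioproject/"

# Assumption: an identifier is a bare numeric ID (e.g. 1090913) or a
# PRJ-style accession such as PRJNA887384 or PRJEB57098.
_ID_PATTERN = re.compile(r"^(?:\d+|PRJ[A-Z]{2}\d+)$")


def bioproject_url(identifier: str) -> str:
    """Return the BioProject landing-page URL for an ID or accession."""
    cleaned = identifier.strip()
    if not _ID_PATTERN.match(cleaned):
        raise ValueError(f"unrecognized BioProject identifier: {identifier!r}")
    return BIOPROJECT_BASE + cleaned


print(bioproject_url("1090913"))
```

Rejecting anything outside the two assumed shapes keeps malformed strings from silently producing dead links; loosen the pattern if other identifier forms are needed.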
NCBI Sequence Read Archive (SRA) accession numbers for fastq sequence files...
dataone.org
search.dataone.org
Updated Mar 9, 2025
Cite
Erica Goetze (2025). NCBI Sequence Read Archive (SRA) accession numbers for fastq sequence files for each zooplankton community sample (Plankton Population Genetics project) [Dataset]. http://doi.org/10.1575/1912/bco-dmo.704665
Unique identifier
https://doi.org/10.1575/1912/bco-dmo.704665
Dataset updated
Mar 9, 2025
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Authors
Erica Goetze
Time period covered
Jun 13, 2014 - Jun 19, 2014
Description
These data include sample information and accession links to sequence data at The National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

This data submission consists of metabarcoding data for the zooplankton community in the epipelagic, mesopelagic, and upper bathypelagic zones (0-1500 m) of the North Pacific Subtropical Gyre. The goal of this study was to assess the hidden diversity present in zooplankton assemblages in midwaters and to detect vertical gradients in species richness, depth distributions, and community composition of the full zooplankton assemblage. Samples were collected in June 2014 from Station ALOHA (22.75, -158) using a 1-meter-square Multiple Opening and Closing Nets and Environmental Sampling System (MOCNESS, 200 µm mesh) on R/V Falkor cruise FK140613. Next-generation sequence data (Illumina MiSeq, V3 chemistry, 300-bp paired-end) of the zooplankton assemblage derive from amplicons of the V1-V2 region of 18S rRNA (primers described in Fonseca et al. 2010). The data include sequences and read count abundance information for molecular OTUs from both holoplanktonic and meroplanktonic taxa.

Related dataset containing OTU tables and fasta sequences (representative / most abundant read for each OTU):
Metabarcoding zooplankton at station ALOHA: OTU tables and fasta files
The OHEJP BeONE Project – Escherichia coli genome assembly dataset
zenodo.org
data.niaid.nih.gov
bin, zip
Updated Jul 24, 2023
Cite
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Escherichia coli genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7267845
Explore at:
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.7267845
Dataset updated
Jul 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 308 Escherichia coli samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7120057), comprising genome assemblies of 1,999 E. coli samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Ec_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype.

The archive “BeONE_Ec_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of E. coli genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57098. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 308 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with seq_typing v2.2.
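
The inclusion rule described above can be summarized as a small decision function. This is a simplified sketch only: the class `AssemblyQC`, its field names, and the function `include_in_final` are hypothetical and do not correspond to the actual Aquamis output columns, and the sketch omits the manual-inspection step the authors mention.

```python
from dataclasses import dataclass, field


@dataclass
class AssemblyQC:
    """Hypothetical per-assembly QC summary (field names are assumptions)."""
    sample_id: str
    failed_checks: set = field(default_factory=set)  # empty set = QC pass
    pct_reads_correct_species: float = 0.0


def include_in_final(assembly: AssemblyQC, threshold: float = 98.0) -> bool:
    """Apply the curation rule: keep QC passes, and recover assemblies that
    failed only on NumContamSNVs when more than `threshold` percent of
    reads correspond to the correct species."""
    if not assembly.failed_checks:
        return True
    failed_only_contam = assembly.failed_checks == {"NumContamSNVs"}
    return failed_only_contam and assembly.pct_reads_correct_species > threshold


examples = [
    AssemblyQC("s1"),
    AssemblyQC("s2", {"NumContamSNVs"}, 99.1),
    AssemblyQC("s3", {"NumContamSNVs"}, 97.5),
    AssemblyQC("s4", {"NumContamSNVs", "OtherCheck"}, 99.9),
]
kept = [a.sample_id for a in examples if include_in_final(a)]
print(kept)
```

The comparison is strict (`>`), matching the ">98%" wording; an assembly at exactly 98% would not be recovered under this reading.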

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.
Sample collection information and sequence accessions at the National Center...
search.dataone.org
bco-dmo.org
Updated Mar 9, 2025
Cite
John J. Stachowicz (2025). Sample collection information and sequence accessions at the National Center for Biotechnology Information (NCBI) for whole genome sequencing of eelgrass (Zostera marina) collected at Bodega and Tomales Bay, CA, USA from July to September 2019 [Dataset]. https://search.dataone.org/view/sha256%3Ad19f54c6d0afe04071aea24c0b6e1cec4b1bc13161f061822b35b92a253fe865
Dataset updated
Mar 9, 2025
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Authors
John J. Stachowicz
Time period covered
Jul 16, 2019 - Sep 30, 2019
Description
This dataset includes sample collection information and sequence accessions at the National Center for Biotechnology Information (NCBI) for whole genome sequencing of eelgrass (Zostera marina) collected at Bodega and Tomales Bay, California, USA from July to September 2019. Sequence Read Archive (SRA) Experiments and BioSamples can be accessed from the NCBI BioProject PRJNA887384 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA887384/).

Results summary as described in Scheibelhut et al. (2023): We examine genomic signals of selection in the eelgrass Zostera marina across temperature gradients in adjacent embayments. Although we find many genomic regions with signals of selection within each bay, there is very little overlap in signals of selection at the SNP level, despite most polymorphisms being shared across bays. We do find overlap at the gene level, potentially suggesting multiple mutational pathways to the same phenotype. Using polygenic models, we find that some sets of candidate SNPs are able to predict temperature across both bays, suggesting that small but parallel shifts in allele frequencies may be missed by independent genome scans. Together, these results highlight the continuous rather than binary nature of parallel evolution in polygenic traits and the complexity of evolutionary predictability.
The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset
zenodo.org
explore.openaire.eu
bin, zip
Updated Jul 24, 2023
Cite
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Listeria monocytogenes genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7267487
Explore at:
Available download formats: bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.7267487
Dataset updated
Jul 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 1,426 Listeria monocytogenes samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7116878), comprising genome assemblies of 1,874 L. monocytogenes samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Lm_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type.

The archive “BeONE_Lm_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of L. monocytogenes genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57166. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,426 isolates passed the dataset curation step and were included in the final dataset.

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.
Whole genome sequencing of three North American large-bodied birds
datasets.ai
data.usgs.gov
55
Updated Sep 11, 2024
Cite
Department of the Interior (2024). Whole genome sequencing of three North American large-bodied birds [Dataset]. https://datasets.ai/datasets/whole-genome-sequencing-of-three-north-american-large-bodied-birds
Explore at:
Available download formats: 55
Dataset updated
Sep 11, 2024
Dataset authored and provided by
Department of the Interior
Description
The data release details the samples, methods, and raw data used to generate high-quality genome assemblies for greater sage-grouse (Centrocercus urophasianus), white-tailed ptarmigan (Lagopus leucura), and trumpeter swan (Cygnus buccinator). The raw data have been deposited in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI), the authoritative repository for public biological sequence data, and are not included in this data release. Instead, the accessions that link to those data via the NCBI portal (www.ncbi.nlm.nih.gov) are provided herein. The release consists of a single file, sample.metadata.txt, which maps NCBI accessions to the samples sequenced and the different types of sequencing performed to generate the assemblies and annotate their gene features.
The OHEJP BeONE Project – Campylobacter jejuni genome assembly dataset
zenodo.org
data.niaid.nih.gov
bin, zip
Updated Jul 24, 2023
Cite
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Campylobacter jejuni genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7802717
Explore at:
Available download formats: bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.7802717
Dataset updated
Jul 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 610 Campylobacter jejuni samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7120166), comprising genome assemblies of 3,076 C. jejuni samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Cj_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers and in-silico Multi Locus Sequence Type, and information regarding year of sampling, country and source.

The archive “BeONE_Cj_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of C. jejuni genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57119. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 610 isolates passed the dataset curation step and were included in the final dataset.

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

Acknowledgements

We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.
1000 Cannabis Genomes Project
kaggle.com
zip
Updated Feb 26, 2019
Cite
Google BigQuery (2019). 1000 Cannabis Genomes Project [Dataset]. https://www.kaggle.com/bigquery/genomics-cannabis
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Feb 26, 2019
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Authors
Google BigQuery
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Cannabis is a genus of flowering plants in the family Cannabaceae.

Source: https://en.wikipedia.org/wiki/Cannabis

Content

In October 2016, Phylos Bioscience released a genomic open dataset of approximately 850 strains of Cannabis via the Open Cannabis Project. In combination with other genomics datasets made available by Courtagen Life Sciences, Michigan State University, NCBI, Sunrise Medicinal, University of Calgary, University of Toronto, and Yunnan Academy of Agricultural Sciences, the total amount of publicly available data exceeds 1,000 samples taken from nearly as many unique strains.

https://medium.com/google-cloud/dna-sequencing-of-1000-cannabis-strains-publicly-available-in-google-bigquery-a33430d63998

These data were retrieved from the National Center for Biotechnology Information’s Sequence Read Archive (NCBI SRA), processed using the BWA aligner and FreeBayes variant caller, indexed with the Google Genomics API, and exported to BigQuery for analysis. Data are available directly from Google Cloud Storage at gs://gcs-public-data--genomics/cannabis; via the Google Genomics API as dataset ID 918853309083001239, with an additional duplicated subset of only transcriptome data as dataset ID 94241232795910911; and in the BigQuery dataset bigquery-public-data:genomics_cannabis.

All tables in the Cannabis Genomes Project dataset have a suffix like _201703. The suffix is referred to as [BUILD_DATE] in the descriptions below. The dataset is updated frequently as new releases become available.

The following tables are included in the Cannabis Genomes Project dataset:

Sample_info contains fields extracted for each SRA sample, including the SRA sample ID and other data that indicate the type of sample. Sample types include: strain, library prep methods, and sequencing technology. See SRP008673 for an example of upstream sample data. SRP008673 is the University of Toronto sequencing of Cannabis sativa subspecies Purple Kush.

MNPR01_reference_[BUILD_DATE] contains reference sequence names and lengths for the draft assembly of Cannabis sativa subspecies Cannatonic produced by Phylos Bioscience. This table contains contig identifiers and their lengths.

MNPR01_[BUILD_DATE] contains variant calls for all included samples and types (genomic, transcriptomic) aligned to the MNPR01_reference_[BUILD_DATE] table. Samples can be found in the sample_info table. The MNPR01_[BUILD_DATE] table is exported using the Google Genomics BigQuery variants schema. This table is useful for general analysis of the Cannabis genome.

MNPR01_transcriptome_[BUILD_DATE] is similar to the MNPR01_[BUILD_DATE] table, but it includes only the subset of transcriptomic samples. This table is useful for transcribed gene-level analysis of the Cannabis genome.
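
The [BUILD_DATE] naming convention above can be made concrete with a short helper. This is a sketch under stated assumptions: the suffix is taken to be a six-digit string like 201703, and the dotted `project.dataset.table` form is the standard-SQL spelling assumed here for the dataset written above as bigquery-public-data:genomics_cannabis. The function `table_names` is hypothetical, and the sample_info table is left out because its exact name is not spelled out consistently in the description.

```python
# Assumption: dotted form of the dataset named in the description.
DATASET = "bigquery-public-data.genomics_cannabis"


def table_names(build_date: str) -> dict:
    """Return fully qualified names for the three MNPR01 tables.

    `build_date` is the suffix without the leading underscore,
    e.g. "201703" (assumed to be six digits).
    """
    if not (len(build_date) == 6 and build_date.isdigit()):
        raise ValueError(f"expected a six-digit build date, got {build_date!r}")
    return {
        "reference": f"{DATASET}.MNPR01_reference_{build_date}",
        "variants": f"{DATASET}.MNPR01_{build_date}",
        "transcriptome": f"{DATASET}.MNPR01_transcriptome_{build_date}",
    }


for role, name in table_names("201703").items():
    print(role, name)
```

Because the dataset is described as being updated as new releases appear, keeping the build date as a parameter avoids hard-coding a suffix that may go stale.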

Fork this kernel to get started with this dataset.

Acknowledgements

Dataset Source: http://opencannabisproject.org/
Category: Genomics
Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://www.ncbi.nlm.nih.gov/home/about/policies.shtml - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Update frequency: As additional data are released to GenBank
View in BigQuery: https://bigquery.cloud.google.com/dataset/bigquery-public-data:genomics_cannabis
View in Google Cloud Storage: gs://gcs-public-data--genomics/cannabis

Banner Photo by Rick Proctor from Unsplash.

Inspiration

Which Cannabis samples are included in the variants table?

Which contigs in the MNPR01_reference_[BUILD_DATE] table have the highest density of variants?

How many variants does each sample have at the THC Synthase gene (THCA1) locus?
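
The contig-density question above reduces to dividing a per-contig variant count by the contig length. The sketch below shows that arithmetic on toy in-memory data; the contig names, lengths, and variant list are invented for illustration and are not taken from the dataset, and the function names are hypothetical. In practice the counts and lengths would come from the variants and reference tables.

```python
from collections import Counter


def variant_density(contig_lengths, variant_contigs):
    """Return variants per base for each contig.

    contig_lengths: mapping of contig name -> length in bases
    variant_contigs: iterable with one contig name per variant call
    """
    counts = Counter(variant_contigs)
    return {
        name: counts[name] / length
        for name, length in contig_lengths.items()
        if length > 0  # skip zero-length entries to avoid division by zero
    }


def top_contigs(density, n=3):
    """Return the n contigs with the highest density, highest first."""
    return sorted(density.items(), key=lambda item: item[1], reverse=True)[:n]


# Toy data (hypothetical): three contigs and 24 variant calls.
lengths = {"contig_A": 1000, "contig_B": 500, "contig_C": 2000}
calls = ["contig_A"] * 10 + ["contig_B"] * 10 + ["contig_C"] * 4

ranked = top_contigs(variant_density(lengths, calls), n=2)
print(ranked)
```

Normalizing by length matters here: contig_A and contig_B carry the same number of calls, but the shorter contig_B ranks first.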
The OHEJP BeONE Project – Salmonella enterica genome assembly dataset
zenodo.org
data.niaid.nih.gov
bin, zip
Updated Jul 24, 2023
Cite
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI (2023). The OHEJP BeONE Project – Salmonella enterica genome assembly dataset [Dataset]. http://doi.org/10.5281/zenodo.7802723
Explore at:
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.7802723
Dataset updated
Jul 24, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Verónica Mixão; Miguel Pinto; João Paulo Gomes; Daniel Sobral; Holger Brendebach; Carlus Deneke; Simon Tausch; Adriano Di Pasquale; Claudia Swart-Coipan; Ewelina Iwan; Jörg Linde; Karin Lagesen; Liljana Petrovska; Mohammed Umaer Naseer; Rolf Sommer Kaas; Sandra Simon; Katrine Joensen; Kristoffer Kiil; Sofie Nielsen; Vítor Borges; INSA; APHA; BfR; DTU; FLI; IZSAM; NIPH; NVI; PIWET; RIVM; RKI; SSI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies of 1,540 Salmonella enterica samples collected by the BeONE Consortium on behalf of the One Health European Joint Programme “BeONE: Building Integrative Tools for One Health Surveillance” (https://onehealthejp.eu/jrp-beone/). Additionally, a complementary dataset is also made available (https://zenodo.org/record/7119735), comprising genome assemblies of 1,434 S. enterica samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA).

File “BeONE_Se_metadata.xlsx” contains the genome assembly statistics for each isolate, including European Nucleotide Archive accession numbers, in-silico Multi Locus Sequence Type and Serotype, and information regarding year of sampling, country and source.

The archive “BeONE_Se_assemblies.zip” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

Dataset selection and curation

This anonymized dataset of S. enterica genome assemblies was generated using Next Generation Sequencing data collected within the BeONE Consortium available at the European Nucleotide Archive under BioProject Accession Number PRJEB57179. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,540 isolates passed the dataset curation step and were included in the final dataset. In-silico serotyping was performed with SeqSero2 v1.2.1 (Zhang et al. 2019).

Funding

This work was supported by funding from the European Union’s Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme.

Acknowledgements

We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.
Metabarcode sequencing of aquatic environmental DNA from the Potomac River...
catalog.data.gov
gimi9.com
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Metabarcode sequencing of aquatic environmental DNA from the Potomac River Watershed, 2015-2020 [Dataset]. https://catalog.data.gov/dataset/metabarcode-sequencing-of-aquatic-environmental-dna-from-the-potomac-river-watershed-2015-
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Potomac River
Description
Biological indicator taxa have long been used for integrative assessments of water quality, particularly benthic invertebrate groups such as arthropods. While standardized protocols have been developed to calculate 'biological index' scores based on the abundances of these taxa, such systems are challenging to implement at large scales due to the sampling effort required, taxonomic expertise needed, and the need for repeated sampling to reliably discriminate sites. Many of the same taxa detected by traditional surveys can also be detected by genetic analysis of environmental DNA (eDNA), potentially allowing for an alternative formulation of biological indexes that might be faster and more economical to produce. The current data were produced to evaluate eDNA-derived biological indexes at sites within the Potomac River watershed of the eastern United States, specifically within units of the National Park Service for which previous biological assessment data were available. This data release consists of five files:
1. sample.metadata.txt, which contains sampling metadata and identifiers linking to sample-derived sequence data that has been deposited in the Sequence Read Archive of the National Center for Biotechnology Information (NCBI). This database is authoritative and comprehensive for sharing high-throughput sequence data produced with public funds. All accessions listed in the file can be searched to retrieve sample and sequence information at www.ncbi.nlm.nih.gov.
2. cox1.references.fasta, which contains reference sequences of the cytochrome c oxidase 1 gene of arthropods (typically abbreviated cox1 or COI), identified from regional checklists. The file is a text file in FASTA format.
3. mt16S.references.fasta, which contains reference sequences of the mitochondrial 16S ribosomal RNA (mt16S) gene of arthropods identified from regional checklists. The file is a text file in FASTA format.
4. first.stage.counts.txt, which is a tab-delimited table of counts of sequences that are attributed to each taxon from each sample for the first stage of the study. Whether the taxon attribution is from the mt16S or cox1 locus is also indicated.
5. second.stage.counts.txt, which is a tab-delimited table of counts of sequences that are attributed to each taxon from each sample for the second stage of the study. Whether the taxon attribution is from the mt16S or cox1 locus is also indicated.
NCBI accession metadata for 18S rRNA gene tag sequences from DNA and RNA...
darchive.mblwhoilibrary.org
bco-dmo.org
pdf, text/tsv, txt +2
Updated Jul 24, 2019
Cite
Sarah K Hu; David Caron (2019). NCBI accession metadata for 18S rRNA gene tag sequences from DNA and RNA from samples collected in coastal California in 2013 and 2014 [Dataset]. https://darchive.mblwhoilibrary.org/entities/publication/438f7d51-f9e5-5c8d-b797-b10f4b04156a
Available download formats: pdf, xml, text/tsv, zip, txt
Dataset updated
Jul 24, 2019
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO). Contact: bco-dmo-data@whoi.edu
Authors
Sarah K Hu; David Caron
Description
NSF Division of Ocean Sciences (NSF OCE) OCE-1737409
Metagenomic detection and reconstruction of Lake Sinai Virus from honey bee...
s.cnmilf.com
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Metagenomic detection and reconstruction of Lake Sinai Virus from honey bee sequence data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/metagenomic-detection-and-reconstruction-of-lake-sinai-virus-from-honey-bee-sequence-data
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Lake Sinai Township
Description
A survey of public honey bee sequence data was performed to detect infections by Lake Sinai Virus (LSV). The Sequence Read Archive of the National Center for Biotechnology Information (NCBI) was queried to identify accessions of RNA sequence data derived from honey bee. These were filtered as described below, and then up to 50 million reads or read pairs were downloaded and searched against a reference database of conserved LSV sequence. Accessions with matches above a specified threshold were downloaded in their entirety and assembled into longer contiguous sequences (contigs). The resulting contigs were searched against each open reading frame (ORF) of the reference LSV genome present in the NCBI database (accession NC_032433.1), and matching regions were extracted from each contig. These ORF sequences were aligned with additional sequences identified in NCBI databases through the BLAST web service. These alignments provide the basis for computing phylogenetic trees, rates of nucleotide substitution, codon usage bias, and other evolutionary parameters.
Genome assemblies and respective wg/cgMLST profiles of a diverse dataset...
data.niaid.nih.gov
explore.openaire.eu
Updated Jul 24, 2023
Cite
Simon Tausch (2023). Genome assemblies and respective wg/cgMLST profiles of a diverse dataset comprising 1,434 Salmonella enterica isolates [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7119735
Dataset updated
Jul 24, 2023
Dataset provided by
Daniel Sobral
Verónica Mixão
Miguel Pinto
Carlus Deneke
Simon Tausch
Vítor Borges
João Paulo Gomes
Holger Brendebach
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies and respective 8,558-loci whole-genome (wg) Multiple Locus Sequence Type (MLST) profiles [INNUENDO schema (Llarena et al. 2018) available in chewie-NS (Mamede et al. 2022)] of a final set of 1,434 Salmonella enterica samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) at the beginning of the analysis (November 2021). This set of samples was carefully selected to cover a wide genetic diversity (assessed in terms of serotype). In total, 125 different serotypes are represented in this dataset, with Typhimurium (including monophasic), Enteritidis and Infantis being the most represented ones and, together, corresponding to 56.2% of the dataset.

File “Se_metadata.xlsx” contains metadata information for each isolate, including ENA/SRA accession number, BioProject and in-silico MLST ST and serotype.

The directory “assemblies/” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

The file “profiles/Se_profiles_wgMLST.tsv” is a tab-separated file with the 8,558-loci wgMLST profiles of each isolate presented in the metadata file. The files “profiles/Se_profiles_cgMLST_95.tsv”, “profiles/Se_profiles_cgMLST_98.tsv” and “profiles/Se_profiles_cgMLST_100.tsv” contain the 3,261-loci, 3,179-loci and 874-loci cgMLST profiles of each isolate presented in the metadata file, respectively. These profiles were determined as explained below.

Dataset selection and curation

With the objective of creating a diverse dataset of S. enterica genome assemblies, we collected information about the genetic diversity (serotype) of the isolates available in the Enterobase database at the beginning of this analysis (November 2021) and in other previous works. Based on this information, we selected an initial dataset comprising 1,779 samples associated with four BioProjects (PRJEB16326, PRJEB20997, PRJEB30335 and PRJEB39988). Their WGS data were downloaded from ENA/SRA with fastq-dl v1.0.6. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,434 isolates passed this curation step and were included in the final dataset. In-silico serotyping was performed with SeqSero2 v1.2.1 (Zhang et al. 2019). wgMLST profiles of each of these isolates were determined with chewBBACA v2.8.5 (Silva et al. 2018), using the 8,558-loci INNUENDO schema available in chewie-NS (Llarena et al. 2018; Mamede et al. 2022) and downloaded on May 31st, 2022. Three cgMLST schemas were obtained with ReporTree v1.0.0 (Mixão et al. 2022) using the 8,558-loci wgMLST profiles of the 1,434 isolates as input and setting distinct “--site-inclusion” thresholds: 0.95, 0.98 and 1.0 (i.e., keep schema loci called in at least 95%, 98% and 100% of the samples, resulting in 3,261-loci, 3,179-loci and 874-loci allelic matrices, respectively).
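The site-inclusion idea (keep only loci called in at least a given fraction of samples) can be illustrated with a small sketch. This is not ReporTree's implementation; the function name, the toy profiles, and the assumption that a missing allele call is coded as "0" are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

MISSING = "0"  # assumption: missing allele calls are coded as "0"


def core_loci(profiles: Dict[str, Dict[str, str]], site_inclusion: float) -> List[str]:
    """Return the loci called in at least `site_inclusion` (0..1) of the samples."""
    samples = list(profiles)
    loci = list(next(iter(profiles.values())))
    kept = []
    for locus in loci:
        # Count samples with a non-missing allele call at this locus.
        called = sum(1 for s in samples if profiles[s][locus] != MISSING)
        if called / len(samples) >= site_inclusion:
            kept.append(locus)
    return kept


# Toy wgMLST matrix: 4 isolates x 3 loci (hypothetical values).
profiles = {
    "iso1": {"locusA": "1", "locusB": "2", "locusC": "0"},
    "iso2": {"locusA": "1", "locusB": "0", "locusC": "0"},
    "iso3": {"locusA": "3", "locusB": "4", "locusC": "5"},
    "iso4": {"locusA": "1", "locusB": "2", "locusC": "7"},
}

for threshold in (0.5, 0.75, 1.0):
    print(threshold, core_loci(profiles, threshold))
```

Raising the threshold can only shrink the retained locus set, which matches the decreasing locus counts reported for the 0.95, 0.98 and 1.0 schemas.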

Acknowledgements

We thank the National Distributed Computing Infrastructure of Portugal (INCD) for providing the necessary resources to run the genome assemblies. INCD was funded by FCT and FEDER under the project 22153-01/SAICT/2016.
Data relating to RNA sequence accessions at NCBI from Ross Sea...
search.dataone.org
bco-dmo.org
Updated Dec 5, 2021
Cite
Rebecca J. Gast (2021). Data relating to RNA sequence accessions at NCBI from Ross Sea Dinoflagellates, Phaeocystis antarctica, Pyramimons tychotreta, and Micromonas polaris (CCMP 2099) (Kleptoplasty project) [Dataset]. https://search.dataone.org/view/http%3A%2F%2Flod.bco-dmo.org%2Fid%2Fdataset%2F728427
Dataset updated
Dec 5, 2021
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Authors
Rebecca J. Gast
Time period covered
Dec 1, 1997 - Apr 7, 1998
Area covered
South Pacific Ocean, Pacific Ocean
Description
This dataset contains data related to RNA sequence genetic accessions at the National Center for Biotechnology Information (NCBI) including information about the host organism, collection location, and collection date.

The accessions are the unprocessed Illumina MiSeq reads for the Ross Sea Dinoflagellate RNA-Seq experiments, Phaeocystis antarctica RNA-Seq experiments, and Pyramimonas tychotreta & Micromonas polaris (CCMP 2099) mixotrophy experiments.

Pyramimonas tychotreta & Micromonas polaris (CCMP 2099) mixotrophy RNA sequences are available through the NCBI Sequence Read Archive (SRA) under the SRA accession number SRP090401 (BioProject PRJNA342459).

Ross Sea Dinoflagellate RNA sequences are available through the NCBI Sequence Read Archive (SRA) under the accession number SRP132912 (BioProject PRJNA428208).

Phaeocystis antarctica RNA sequences are available through the NCBI Sequence Read Archive (SRA) under the accession number SRP133243 (BioProject PRJNA434497).
Genome assemblies and respective wg/cgMLST profiles of a diverse dataset...
data.niaid.nih.gov
zenodo.org
Updated Jul 24, 2023
Cite
Simon Tausch (2023). Genome assemblies and respective wg/cgMLST profiles of a diverse dataset comprising 1,999 Escherichia coli isolates [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7120057
Dataset updated
Jul 24, 2023
Dataset provided by
Daniel Sobral
Verónica Mixão
Miguel Pinto
Carlus Deneke
Simon Tausch
Vítor Borges
João Paulo Gomes
Holger Brendebach
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Dataset

This dataset comprises the genome assemblies and respective 7,601-loci whole-genome (wg) Multiple Locus Sequence Type (MLST) profiles [INNUENDO schema (Llarena et al. 2018) available in chewie-NS (Mamede et al. 2022)] of a final set of 1,999 Escherichia coli samples selected among the Whole-Genome Sequencing (WGS) data publicly available in the European Nucleotide Archive (ENA) or in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) at the beginning of the analysis (November 2021). This set of samples was carefully selected to cover a wide genetic diversity (assessed in terms of serotype). In total, 411 different serotypes are represented in this dataset, with O157:H7 being the most represented one, corresponding to 37.1% of the dataset.

File “Ec_metadata.xlsx” contains metadata information for each isolate, including ENA/SRA accession number, BioProject and in-silico MLST ST and serotype.

The directory “assemblies/” contains all the genome assemblies (.fasta format) of each isolate presented in the metadata file.

The file “profiles/Ec_profiles_wgMLST.tsv” is a tab-separated file with the 7,601-loci wgMLST profiles of each isolate presented in the metadata file. The files “profiles/Ec_profiles_cgMLST_95.tsv”, “profiles/Ec_profiles_cgMLST_98.tsv” and “profiles/Ec_profiles_cgMLST_100.tsv” contain the 2,826-loci, 2,704-loci and 465-loci cgMLST profiles of each isolate presented in the metadata file, respectively. These profiles were determined as explained below.

Dataset selection and curation

With the objective of creating a diverse dataset of E. coli genome assemblies, we collected information about the genetic diversity (serotype) of the isolates available in the Enterobase database at the beginning of this analysis (November 2021) and in other previous works. Based on this information, we selected an initial dataset comprising 2,688 samples associated with three BioProjects (PRJNA230969, PRJEB27020 and PRJNA248042). Their WGS data were downloaded from ENA/SRA with fastq-dl v1.0.6. Read quality control, trimming and assembly were performed with Aquamis v1.3.9 (Deneke et al. 2021) using default parameters. Assembly quality control (QC), including contamination assessment, as well as MLST ST determination were performed with the same pipeline. All genome assemblies passing the QC were included in the final dataset. Among the others, we noticed that a considerable proportion of assemblies was flagged as “QC fail” exclusively due to the “NumContamSNVs” parameter, suggesting that this setting might have been too strict. After manual inspection of a random subset, assemblies for which the percentage of reads corresponding to the correct species was >98% were recovered and integrated in the final dataset (those samples are labeled in the Metadata file). In total, 1,999 isolates passed this curation step and were included in the final dataset. In-silico serotyping was performed with seq_typing v2.2. wgMLST profiles of each of these isolates were determined with chewBBACA v2.8.5 (Silva et al. 2018), using the 7,601-loci INNUENDO schema available in chewie-NS (Llarena et al. 2018; Mamede et al. 2022) and downloaded on May 31st, 2022. Three cgMLST schemas were obtained with ReporTree v1.0.0 (Mixão et al. 2022) using the 7,601-loci wgMLST profiles of the 1,999 isolates as input and setting distinct “--site-inclusion” thresholds: 0.95, 0.98 and 1.0 (i.e., keep schema loci called in at least 95%, 98% and 100% of the samples, resulting in 2,826-loci, 2,704-loci and 465-loci allelic matrices, respectively).
Coral gene expression Sequence Read Archive (SRA) accession numbers and...
search.dataone.org
darchive.mblwhoilibrary.org
Updated Mar 9, 2025
Cite
Sarah W. Davies (2025). Coral gene expression Sequence Read Archive (SRA) accession numbers and information for samples collected at the Flower Garden Banks National Marine Sanctuary in the Gulf of Mexico in September and October of 2017 to capture effects of Hurricane Harvey [Dataset]. https://search.dataone.org/view/sha256%3Adc277481f592102ecdb5e6c56d30acd633733ed8971fc5e9b1481c81551a9bab
Dataset updated
Mar 9, 2025
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Authors
Sarah W. Davies
Time period covered
Sep 1, 2017 - Oct 1, 2017
Description
To capture the immediate effects of storm-driven freshwater runoff on coral and symbiont physiology, we leveraged the heavy rainfall associated with Hurricane Harvey in late August 2017 by sampling FGB coral gene expression at two time points: September 2017, when surface water salinity was reduced (∼34 ppt); and 1 month later when salinity had returned to typical levels (∼36 ppt in October 2017).

This dataset includes Sequence Read Archive (SRA) and BioSample accessions under BioProject PRJNA552981 at the National Center for Biotechnology Information. It also contains sample information and species names for samples collected at the east and west banks of the Flower Garden Banks National Marine Sanctuary (FGBNMS) at 80 ft.

These data were published in Wright et al. (2019).
Reduced representation sequencing and genotyping of Arizona Toads (Anaxyrus...
s.cnmilf.com
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Reduced representation sequencing and genotyping of Arizona Toads (Anaxyrus microscaphus) from the southwestern United States [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/reduced-representation-sequencing-and-genotyping-of-arizona-toads-anaxyrus-microscaphus-fr
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Southwestern United States, United States
Description
The dataset consists of genotypes (diploid base calls at variant sites) at 3,601 anonymous sites of the Arizona Toad (Anaxyrus microscaphus) nuclear genome. The genotyped samples are representative of the range of the species and its major population units, and the genotyped loci have a high degree of completeness. This data release consists of several files:
1. sample.metadata.txt, which contains sampling metadata and identifiers linking to sequence data that has been deposited in the Sequence Read Archive of the National Center for Biotechnology Information (NCBI). This database is authoritative and comprehensive for sharing high-throughput sequence data produced with public funds. All NCBI-derived accessions listed in the file can be searched at www.ncbi.nlm.nih.gov to retrieve sample and sequence information, as can the umbrella BioProject accession PRJNA995169.
2. genotypes.genepop.txt, which contains inferred genetic variants in a common and convertible text-based format.
Sample information and genetic accession information for raw low-coverage...
darchive.mblwhoilibrary.org
search.dataone.org
csv, pdf, xml
Updated Jan 2, 2024
Cite
Biological and Chemical Oceanography Data Management Office (BCO-DMO). Contact: bco-dmo-data@whoi.edu (2024). Sample information and genetic accession information for raw low-coverage genomic sequence reads from 248 different Atlantic silverside (Menidia menidia) collected along the east coast of North America between 2005 to 2007 [Dataset]. https://darchive.mblwhoilibrary.org/entities/publication/da6c3e96-3253-41ec-86d2-1ebfb6e24031
Available download formats: pdf, xml, csv
Dataset updated
Jan 2, 2024
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO). Contact: bco-dmo-data@whoi.edu
Description
Dataset: Raw low-coverage whole genome sequencing reads
Data from: Raptor roosts as invasion archives: insights from the first black...
agdatacommons.nal.usda.gov
bin
Updated Mar 12, 2025
Cite
University of Puerto Rico Mayaguez (2025). Raptor roosts as invasion archives: insights from the first black rat mitochondrial genome sequenced from the Caribbean [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Raptor_roosts_as_invasion_archives_insights_from_the_first_black_rat_mitochondrial_genome_sequenced_from_the_Caribbean/25089047/1
Available download formats: bin
Dataset updated
Mar 12, 2025
Dataset provided by
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
Authors
University of Puerto Rico Mayaguez
License
https://rightsstatements.org/vocab/UND/1.0/
Area covered
Caribbean
Description
Raptor roosts as invasion archives: insights from the first black rat mitochondrial genome sequenced from the Caribbean
