100+ datasets found
  1. Whole-genome sequencing raw data

    • figshare.com
    txt
    Updated Nov 20, 2019
    Cite
    Ricardo Ramiro (2019). Whole-genome sequencing raw data [Dataset]. http://doi.org/10.6084/m9.figshare.10048712.v1
    Available download formats: txt
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ricardo Ramiro
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Whole genome sequencing raw data for the manuscript "Low Mutational Load Allows for High Mutation Rate Variation in Gut Commensal Bacteria" (https://doi.org/10.1101/568709). All fastq files generated for this project are made available in FigShare and are deposited in the NCBI short-read archive (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA580181/). The first three codes in a file name, separated by underscores, indicate the identity of the sequenced clone, which is further described in the file clone_description.csv. Some clones have two sets of R1 and R2 files, as these were sequenced twice to achieve the required coverage. These data will be deposited in the NCBI short-read archive.
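
    The file-naming rule in this description (the first three underscore-separated codes identify the clone) can be sketched in Python as below. The example file name is hypothetical and is not taken from the dataset; it assumes at least one further field follows the three clone codes. Real names should be checked against clone_description.csv.

```python
from pathlib import Path


def clone_id(file_name: str) -> str:
    """Return the first three underscore-separated codes of a file name."""
    codes = Path(file_name).name.split("_")
    if len(codes) < 3:
        raise ValueError(f"expected at least three codes in {file_name!r}")
    return "_".join(codes[:3])


# Hypothetical file name, used only to illustrate the convention.
example = "A1_B2_C3_S5_R1.fastq.gz"
print(clone_id(example))
```

    Grouping files by this identifier would also bring together the two sets of R1 and R2 files of clones that were sequenced twice.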

  2. Data from: Clinical Genomic Database

    • dknet.org
    • scicrunch.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Clinical Genomic Database [Dataset]. http://identifiers.org/RRID:SCR_006427
    Dataset updated
    Jan 29, 2022
    Description

    Manually curated database of all conditions with known genetic causes, focusing on medically significant genetic data with available interventions. Includes gene symbol, conditions, allelic conditions, inheritance, age at which interventions are indicated, clinical categorization, and general description of interventions/rationale. Contents are intended to describe types of interventions that might be considered. Includes only single gene alterations and does not include genetic associations or susceptibility factors related to more complex diseases.

  3. Whole-exome and whole-genome sequencing data

    • ega-archive.org
    Updated Jun 12, 2019
    Cite
    (2019). Whole-exome and whole-genome sequencing data [Dataset]. https://ega-archive.org/datasets/EGAD00001005087
    Dataset updated
    Jun 12, 2019
    License

    https://ega-archive.org/dacs/EGAC00001001233

    Description

    Multi-omic data for lung neuroendocrine neoplasms, including the first multi-omic sequencing data for the understudied lung atypical carcinoids. The data include whole-exomes, whole-genomes, RNA-seq, and EPIC 850K methylation array data.

  4. Whole genome sequencing of three North American large-bodied birds

    • data.usgs.gov
    • datasets.ai
    • +1more
    Updated Dec 13, 2023
    Cite
    Robert Cornman; Jennifer Fike; Sara Oyler-McCance (2023). Whole genome sequencing of three North American large-bodied birds [Dataset]. http://doi.org/10.5066/P9DK14PM
    Dataset updated
    Dec 13, 2023
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Robert Cornman; Jennifer Fike; Sara Oyler-McCance
    License

    U.S. Government Works, https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jul 15, 2021
    Description

    The data release details the samples, methods, and raw data used to generate high-quality genome assemblies for greater sage-grouse (Centrocercus urophasianus), white-tailed ptarmigan (Lagopus leucura), and trumpeter swan (Cygnus buccinator). The raw data have been deposited in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI), the authoritative repository for public biological sequence data, and are not included in this data release. Instead, the accessions that link to those data via the NCBI portal (www.ncbi.nlm.nih.gov) are provided herein. The release consists of a single file, sample.metadata.txt, which maps NCBI accessions to the samples sequenced and the different types of sequencing performed to generate the assemblies and annotate their gene features.

  5. Whole genome sequencing data from maize landraces

    • figshare.com
    Updated Aug 28, 2020
    Cite
    Daniel Runcie (2020). Whole genome sequencing data from maize landraces [Dataset]. http://doi.org/10.6084/m9.figshare.12890768.v1
    Dataset updated
    Aug 28, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Runcie
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Low-coverage (~1x) whole genome sequence data of 108 maize landraces from Mexico or South America. Landraces were sampled according to a latitudinally stratified strategy, with pairs of highland and lowland landraces selected from the same 0.5 degree latitudinal band. Paired-end 150bp reads were generated on the Illumina HiSeq X 10. Raw reads are available here: /iplant/home/deruncie/HiLo/WGS

  6. Data from: Aggregated variant data from whole-genome sequenced tinnitus...

    • data.niaid.nih.gov
    • produccioncientifica.ugr.es
    • +1more
    Updated Nov 9, 2022
    Cite
    Robles-Bolivar, Paula (2022). Aggregated variant data from whole-genome sequenced tinnitus patients (TIGER) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7304955
    Dataset updated
    Nov 9, 2022
    Dataset provided by
    Frejo, Lidia
    Cederroth, Christopher R
    Gallus, Silvano
    Robles-Bolivar, Paula
    Canlon, Barbara
    Lopez-Escamez, Jose Antonio
    Perez-Carpena, Patricia
    Trpchevska, Natalia
    Gallego-Martinez, Alvaro
    Roman-Naranjo, Pablo
    Bulla, Jan
    Escalera-Balsera, Alba
    Description

    Aggregated variant data obtained from tinnitus patients from Sweden.

    Uploaded datasets are stored in annotated csv files. Annotation was performed using VEP (v106), including population frequencies for each variant from gnomAD, non-Finnish Europeans from gnomAD, and the Swedish population from the SweGen project. Pathogenicity scores from CADD are also annotated for each variant. Variants from genes found to be enriched in a gene burden analysis can be found in this aggregated dataset.

    agg.tiger.csv - The TIGER cohort is composed of 97 Swedish whole-genome sequenced constant tinnitus patients.

    agg.jaguar.csv - The JAGUAR cohort is composed of 147 Swedish whole-exome sequenced tinnitus patients.

    agg.sevtin.csv - The SEVTIN cohort is a subcohort of TIGER, with 34 WGS patients segregating a severe tinnitus phenotype.

    agg.controls.csv - Controls is a Swedish population cohort composed of 151 whole-exome sequenced individuals.

  7. The results of whole genome sequence database (the TrueBac™ ID-Genome...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    The results of whole genome sequence database (the TrueBac™ ID-Genome system) matching for the novel Cupriavidus species. [Dataset]. https://plos.figshare.com/articles/dataset/The_results_of_whole_genome_sequence_database_the_TrueBac_sup_TM_sup_ID-Genome_system_matching_for_the_novel_i_Cupriavidus_i_species_/12297467
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Oh Joo Kweon; Yong Kwan Lim; Hye Ryoun Kim; Tae-Hyoung Kim; Sung-min Ha; Mi-Kyung Lee
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The results of whole genome sequence database (the TrueBac™ ID-Genome system) matching for the novel Cupriavidus species.

  8. Data from: Whole-genome sequence data and analysis of a Staphylococcus...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Data from: Whole-genome sequence data and analysis of a Staphylococcus aureus strain SJTUF_J27 isolated from seaweed [Dataset]. https://catalog.data.gov/dataset/data-from-whole-genome-sequence-data-and-analysis-of-a-staphylococcus-aureus-strain-sjtuf--5d2cc
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    The complete genome sequence data of S. aureus SJTUF_J27 isolated from seaweed in China is reported here. The size of the genome is 2.8 Mbp with 32.9% G+C content, consisting of 2614 coding sequences and 77 RNAs. A number of virulence factors, including antimicrobial resistance genes (fluoroquinolone, beta-lactams, fosfomycin, mupirocin, trimethoprim, and aminocoumarin) and the egc enterotoxin cluster, were found in the genome. In addition, the genes encoding metal-binding proteins and associated heavy metal resistance were identified. Phylogenetic data analysis, based upon genome-wide single nucleotide polymorphisms (SNPs), and comparative genomic evaluation with BLAST Ring Image Generator (BRIG) were performed for SJTUF_J27 and four S. aureus strains isolated from food. The completed genome data was deposited in NCBI's GenBank under the accession number CP019117, https://www.ncbi.nlm.nih.gov/nuccore/CP019117.

    Resources in this dataset:

    Resource Title: NCBI GenBank Accession CP019117.1: Staphylococcus aureus strain SJTUF_J27 chromosome, complete genome. File Name: Web Page, url: https://www.ncbi.nlm.nih.gov/nuccore/CP019117

    With an average of 331-fold sequencing coverage, a genome size of 2,804,759 bp constituting 32.9% of G+C content was generated. RAST annotation of the genome revealed a total of 399 subsystems, 2614 coding sequences (80 of them related to virulence, disease and defense), and 77 RNAs. PathogenFinder showed the probability of this strain being a human pathogen was 98%. Bacteria and source DNA available from Xianming Shi, 800 Dongchuan Road, Shanghai, China, 200240. Annotation was added by the NCBI Prokaryotic Genome Annotation Pipeline (released 2013).

  9. Genomic Data Analysis Service Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 10, 2025
    Cite
    Archive Market Research (2025). Genomic Data Analysis Service Report [Dataset]. https://www.archivemarketresearch.com/reports/genomic-data-analysis-service-55611
    Available download formats: pdf, ppt, doc
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global genomic data analysis service market is experiencing robust growth, projected to reach $1769.8 million in 2025 and exhibiting a Compound Annual Growth Rate (CAGR) of 13.1% from 2025 to 2033. This expansion is fueled by several key factors. Advances in next-generation sequencing (NGS) technologies are generating massive datasets, driving the demand for sophisticated analytical tools and services. The increasing affordability of genomic sequencing, coupled with expanding applications across healthcare, agriculture, and environmental research, further contributes to market growth. Specifically, the human application segment dominates, driven by personalized medicine initiatives and the growing understanding of the role of genomics in disease diagnosis and treatment. The rise of cloud-based data analysis platforms, offering scalability and cost-effectiveness, is also a significant driver. While data security and privacy concerns present a challenge, the development of robust data management and security protocols is mitigating this risk. Furthermore, the growing adoption of AI and machine learning in genomic data analysis enhances accuracy and efficiency, accelerating market growth. Segmentation within the market reveals strong performance across various application areas. Whole Genome Sequence Analysis and Whole Exome Sequence Analysis segments are major contributors, reflecting the comprehensive nature of the data generated and the insights derived. Geographically, North America currently holds a significant market share, driven by early adoption of advanced technologies and strong funding for research and development. However, Asia Pacific is anticipated to experience rapid growth, fueled by increasing investments in healthcare infrastructure and expanding genomics research activities in countries like China and India. Competition is intense, with established players like Illumina and QIAGEN alongside emerging companies offering specialized solutions. 
The continuous innovation in sequencing technologies and analytical methods ensures the ongoing evolution and expansion of this dynamic market.
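
    The growth figures quoted in this description (a 2025 base of $1769.8 million and a 13.1% CAGR through 2033) imply a compounding projection, sketched below. Only the base value and the rate come from the description; the function name and the year-by-year table are illustrative assumptions and are not figures published in the report.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound a base value forward by `years` at a constant annual rate."""
    return base_value * (1.0 + cagr) ** years


base_2025 = 1769.8  # USD million, as stated in the description
cagr = 0.131        # 13.1% per year, as stated in the description

# Print the implied market size for each year of the forecast period.
for year in range(2025, 2034):
    size = project_market_size(base_2025, cagr, year - 2025)
    print(year, round(size, 1))
```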

  10. Data from: Demographic history and inbreeding in two declining sea duck...

    • datadryad.org
    • search.dataone.org
    • +1more
    zip
    Updated Jul 20, 2024
    Cite
    María Ignacia Cádiz; Aja Noersgaard Buur Tengstedt; Iben Hove Sørensen; Emma Skindbjerg Pedersen; Anthony David Fox; Michael Møller Hansen (2024). Demographic history and inbreeding in two declining sea duck species inferred from whole genome sequence data [Dataset]. http://doi.org/10.5061/dryad.w3r22810z
    Available download formats: zip
    Dataset updated
    Jul 20, 2024
    Dataset provided by
    Dryad
    Authors
    María Ignacia Cádiz; Aja Noersgaard Buur Tengstedt; Iben Hove Sørensen; Emma Skindbjerg Pedersen; Anthony David Fox; Michael Møller Hansen
    Description

    Data from: Demographic history and inbreeding in two declining sea duck species inferred from whole genome sequence data

    https://doi.org/10.5061/dryad.w3r22810z

    Description of the data and file structure

    The data encompasses multiple VCF files generated from whole-genome resequencing of two sea duck species, the velvet scoter (Melanitta fusca) and long-tailed duck (Clangula hyemalis).

    1A) LTD_raw.vcf.gz Raw variant calls from WGS data from long-tailed duck.

    2A) LTD.filtered.max2.QUAL30.minDP10.minGQ15.miss0.9.variant.sansSexChr.HWE.vcf Variant calls from file 1A) filtered based on quality, depth, missing data and HWE. Contains only bi-allelic SNPs of QUAL>30, only genotypes with read depth >10 and genotype quality >15, and only sites in Hardy-Weinberg equilibrium, located on autosomal scaffolds and with <10% missing data and pooled read depth >250 and <470.

    **3A) LTD.filtered.max2.QUAL30.minDP10.minGQ15.miss0.9.variant.sa...
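
    The thresholds listed for file 2A can be read as a site-level predicate, sketched below in plain Python. This is an illustration under stated assumptions, not the authors' pipeline: the Genotype and Site classes are hypothetical stand-ins for parsed VCF records, genotypes failing the depth or quality thresholds are treated as missing, and the Hardy-Weinberg and autosomal-scaffold filters are omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Genotype:
    depth: int    # per-sample read depth (DP)
    quality: int  # genotype quality (GQ)


@dataclass
class Site:
    qual: float                # site-level QUAL
    n_alleles: int             # 2 for a bi-allelic site
    is_snp: bool
    genotypes: List[Genotype]  # one entry per sample


def genotype_ok(gt: Genotype) -> bool:
    """Keep genotypes with read depth >10 and genotype quality >15."""
    return gt.depth > 10 and gt.quality > 15


def site_ok(site: Site) -> bool:
    """Apply the bi-allelic SNP, QUAL, pooled-depth and missingness thresholds."""
    if not site.genotypes:
        return False
    if not (site.is_snp and site.n_alleles == 2 and site.qual > 30):
        return False
    pooled_depth = sum(gt.depth for gt in site.genotypes)
    if not 250 < pooled_depth < 470:
        return False
    # Assumption: a genotype that fails its thresholds counts as missing.
    missing = sum(not genotype_ok(gt) for gt in site.genotypes)
    return missing / len(site.genotypes) < 0.10


# Hypothetical site: 20 samples, each with depth 15 and genotype quality 30.
good = Site(qual=50.0, n_alleles=2, is_snp=True,
            genotypes=[Genotype(15, 30) for _ in range(20)])
print(site_ok(good))
```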

  11. Data from: Resolving evolutionary relationships in closely related species...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jun 29, 2015
    Cite
    Alexander Nater; Reto Burri; Takeshi Kawakami; Linnéa Smeds; Hans Ellegren (2015). Resolving evolutionary relationships in closely related species with whole-genome sequencing data [Dataset]. http://doi.org/10.5061/dryad.b6gj8
    Available download formats: zip
    Dataset updated
    Jun 29, 2015
    Dataset provided by
    Uppsala University
    Authors
    Alexander Nater; Reto Burri; Takeshi Kawakami; Linnéa Smeds; Hans Ellegren
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Using genetic data to resolve the evolutionary relationships of species is of major interest in evolutionary and systematic biology. However, reconstructing the sequence of speciation events, the so-called species tree, in closely related and potentially hybridizing species is very challenging. Processes such as incomplete lineage sorting and interspecific gene flow result in local gene genealogies that differ in their topology from the species tree, and analyses of few loci with a single sequence per species are likely to produce conflicting or even misleading results. To study these phenomena on a full phylogenomic scale, we use whole-genome sequence data from 200 individuals of four black-and-white flycatcher species with so far unresolved phylogenetic relationships to infer gene tree topologies and visualize genome-wide patterns of gene tree incongruence. Using phylogenetic analysis in nonoverlapping 10-kb windows, we show that gene tree topologies are extremely diverse and change on a very small physical scale. Moreover, we find strong evidence for gene flow among flycatcher species, with distinct patterns of reduced introgression on the Z chromosome. To resolve species relationships on the background of widespread gene tree incongruence, we used four complementary coalescent-based methods for species tree reconstruction, including complex modeling approaches that incorporate post-divergence gene flow among species. This allowed us to infer the most likely species tree with high confidence. Based on this finding, we show that regions of reduced effective population size, which have been suggested as particularly useful for species tree inference, can produce positively misleading species tree topologies. 
Our findings disclose the pitfalls of using loci potentially under selection as phylogenetic markers and highlight the potential of modeling approaches to disentangle species relationships in systems with large effective population sizes and post-divergence gene flow.

  12. Genomic Data Analysis Service Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 10, 2025
    Cite
    Archive Market Research (2025). Genomic Data Analysis Service Report [Dataset]. https://www.archivemarketresearch.com/reports/genomic-data-analysis-service-55807
    Available download formats: pdf, doc, ppt
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Genomic Data Analysis Service market is experiencing robust growth, projected to reach $4192.3 million in 2025. While the provided CAGR is missing, considering the rapid advancements in genomics technologies and increasing demand for personalized medicine, a conservative estimate of 15% CAGR from 2025-2033 seems reasonable. This implies significant market expansion, driven by factors such as decreasing sequencing costs, growing adoption of next-generation sequencing (NGS) technologies, and the increasing need for efficient and accurate analysis of large genomic datasets. The market is segmented by application (humanity, plant, animal, microorganism, virus) and by type of analysis (whole genome sequence analysis, whole exome sequence analysis, and others). The growth is fueled by the expanding application of genomic analysis across diverse sectors like healthcare, agriculture, and environmental science. Whole genome sequencing is expected to dominate the market due to its comprehensive nature, providing a complete picture of an organism's genetic makeup. However, whole exome sequencing remains a significant segment due to its cost-effectiveness and ability to target specific protein-coding regions. Key players such as Illumina, QIAGEN, and BGI Genomics are leading the market through continuous innovation in software and analytical tools. The market's geographical spread is substantial, with North America and Europe currently holding the largest market shares due to well-established research infrastructure and technological advancements. However, the Asia-Pacific region is projected to witness significant growth driven by rising investments in healthcare infrastructure and increasing adoption of genomic technologies. The market is expected to continue its upward trajectory throughout the forecast period (2025-2033), driven by ongoing technological innovations that enhance data analysis speed and accuracy. 
The increasing availability of large genomic datasets, fueled by large-scale genomics initiatives, provides a fertile ground for the development of advanced analytical tools. Furthermore, the increasing demand for personalized medicine and precision agriculture is further accelerating the adoption of genomic data analysis services. However, challenges remain, including the need for standardized data formats, data security concerns associated with handling sensitive genomic data, and the need for skilled professionals to interpret and utilize the complex data generated. Addressing these challenges will be critical for continued market growth and widespread adoption of genomic data analysis services.

  13. Data from: Genomic characterization of relapsed acute myeloid leukemia...

    • figshare.scilifelab.se
    • researchdata.se
    Updated Jan 15, 2025
    Cite
    Linda Holmfeldt; Svea Stratmann (2025). Data from: Genomic characterization of relapsed acute myeloid leukemia reveals novel putative therapeutic targets [Dataset]. http://doi.org/10.17044/scilifelab.12292778.v1
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    Uppsala Universitet
    Authors
    Linda Holmfeldt; Svea Stratmann
    License

    https://www.scilifelab.se/data/restricted-access/

    Description

    Data Set Description

    These data are collected from a total of 73 participants (48 adult; 25 pediatric), all of whom had relapsed or primary resistant acute myeloid leukemia. The data, which here are separated into an adult and a pediatric dataset, were generated as part of a study by Stratmann et al., 2021 (https://doi.org/10.1182/bloodadvances.2020003709).

    Please note that separate applications are necessary for the adult and pediatric dataset, respectively. When applying for access, please indicate which of the datasets the application applies to.

    The adult dataset contains whole genome sequencing data from 62 diagnosis (D), relapse (R1/R2/R3) and/or primary resistant (PR) leukemia samples, and 37 normal (G) DNA samples from 37 patients, as well as whole exome sequencing data from 20 leukemia samples and one (1) normal DNA sample from 15 patients.

    The pediatric dataset contains whole genome sequencing data from 49 diagnosis (D), relapse (R1/R2/R3), persistent relapse (R1/2-P) and/or primary resistant (PR) leukemia samples and 24 normal (G) DNA samples from 23 patients, as well as whole exome sequencing data from seven (7) leukemia samples from five (5) patients.

    The leukemia samples originate from bone marrow or peripheral blood. The patient-matched normal DNA samples originate from either complete remission bone marrow or peripheral blood cells, or from normal bone marrow stromal cells cultivated from leukemia bone marrow. Further details regarding the samples are available in the Supplemental Information part of Stratmann et al., 2021 (https://doi.org/10.1182/bloodadvances.2020003709).

    Whole genome sequencing libraries and associated next-generation sequencing were carried out by the SNP&SEQ Technology platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. Libraries were prepared using the TruSeq PCR-free DNA sample preparation Kit, followed by paired-end 150bp read length sequencing on a HiSeqX (Illumina Inc.).

    Whole exome sequencing libraries and associated next-generation sequencing were carried out by the Uppsala Genome Center, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. Libraries were prepared using the Ion AmpliSeq Whole Exome Library Preparation kit, followed by sequencing on the Ion Proton System using Ion PI Chip v2 and Ion PI Sequencing 200 Kit v3 chemistry (Thermo Fisher Scientific).

    Terms for access

    The adult and pediatric datasets are only to be used for research that is seeking to advance the understanding of the influence of genetic factors on human acute myeloid leukemia etiology and biology.

    Use of the protected pediatric dataset is only for research projects that can only be conducted using pediatric acute myeloid leukemia data, and for which the research objectives cannot be accomplished using data from adults. Applications intending various method development would thus not be considered as acceptable for use of the pediatric dataset. Further, the pediatric dataset may not be used for research investigating predisposition for acute myeloid leukemia based on germline variants.

    To apply for conditional access to the adult and/or pediatric dataset in this publication, please contact datacentre@scilifelab.se.

  14. Whole genome sequencing of Centaurea diffusa: native individual from Turkey

    • data.nkn.uidaho.edu
    • verso.uidaho.edu
    file explorer
    Updated Oct 24, 2015
    Cite
    University of British Columbia (2015). Whole genome sequencing of Centaurea diffusa: native individual from Turkey [Dataset]. https://data.nkn.uidaho.edu/dataset/whole-genome-sequencing-centaurea-diffusa-native-individual-turkey
    Available download formats: file explorer
    Dataset updated
    Oct 24, 2015
    Dataset provided by
    U.S. National Institutes of Health - SRA Database
    Authors
    University of British Columbia
    License

    https://creativecommons.org/licenses/publicdomain/

    https://spdx.org/licenses/CC-PDDC

    Description

    Raw data used to produce the draft reference genome for "Genomic Analyses of Phenotypic Differences Between Native and Invasive Populations of Diffuse Knapweed (Centaurea diffusa)" (https://doi.org/10.3389/fevo.2020.577635). Young leaf tissue was sampled from a single individual (TR001-1L) and stored at -80 °C. DNA was extracted from frozen tissue using a modified DNeasy column-less protocol. Concentration and quality were verified by Nanodrop, Qubit high-sensitivity assay, and gel electrophoresis. This whole genome shotgun library was sequenced at Genome Quebec, using one half lane of Illumina HiSeq 2000 paired-end sequencing.

  15. Data from: Angus Sequence Data: Animal 186-6

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +1more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Angus Sequence Data: Animal 186-6 [Dataset]. https://catalog.data.gov/dataset/angus-sequence-data-animal-186-6-dea96
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    Whole genome sequence data for Bovidae Bos taurus - beef Angus. The data is in "fastq" format.

    The National Animal Germplasm Program does not have germplasm for this animal.

    There are two versions of each file because we did paired end sequencing. There are two reads for each of the 210 data lines (a forward and a reverse read) summing to 420 total. A diagram of this is provided in the Collection Dataset. In the diagram, the two reads would correspond to read 1 and read 3.

    Resources in this dataset:

    Resource Title: Animal 186-6 Sequence Data - SCINet. File Name: Web Page, url: https://app.globus.org/file-manager?origin_id=904c2108-90cf-11e8-9672-0a6d4e044368&origin_path=/LTS/ADCdatastorage/NAL/published/node33204/

    tar file containing 14 files. The files are:

    RAPiD-Genomics_F112_CSU_136201_P001_WA04_i5-515_i7-100_S4_L003_R1_001.fastq.gz
    RAPiD-Genomics_F112_CSU_136201_P001_WA04_i5-515_i7-100_S4_L003_R2_001.fastq.gz
    RAPiD-Genomics-F113-CSU-136201-P001-WA04-i5-515-i7-100_S140_L002_R1_001.fastq.gz
    RAPiD-Genomics-F113-CSU-136201-P001-WA04-i5-515-i7-100_S140_L002_R2_001.fastq.gz
    RAPiD-Genomics-F113-CSU-136201-P001-WA04-i5-515-i7-100_S4_L001_R1_001.fastq.gz
    RAPiD-Genomics-F113-CSU-136201-P001-WA04-i5-515-i7-100_S4_L001_R2_001.fastq.gz
    RAPiD-Genomics-F114-CSU-136201-P001-WA04-i5-515-i7-100_S4_L003_R1_001.fastq.gz
    RAPiD-Genomics-F114-CSU-136201-P001-WA04-i5-515-i7-100_S4_L003_R2_001.fastq.gz
    RAPiD-Genomics_F115_CSU_136201_P001_WA04_i5-515_i7-100_S4_L001_R1_001.fastq.gz
    RAPiD-Genomics_F115_CSU_136201_P001_WA04_i5-515_i7-100_S4_L001_R2_001.fastq.gz
    RAPiD-Genomics_F115_CSU_136201_P001_WA04_i5-515_i7-100_S4_L002_R1_001.fastq.gz
    RAPiD-Genomics_F115_CSU_136201_P001_WA04_i5-515_i7-100_S4_L002_R2_001.fastq.gz
    RAPiD-Genomics_F116_CSU_136201_P001_WA04_i5-515_i7-100_S359_L002_R1_001.fastq.gz
    RAPiD-Genomics_F116_CSU_136201_P001_WA04_i5-515_i7-100_S359_L002_R2_001.fastq.gz

    SCINet users: The .tar file can be accessed/retrieved with a valid SCINet account at this location: /LTS/ADCdatastorage/NAL/published/node33204/186-6.tar

    See the SCINet File Transfer guide for more information on moving large files: https://scinet.usda.gov/guides/data/datatransfer

    Globus users: The files can also be accessed through Globus by following this data link. The user will need to log in to Globus in order to retrieve this data. User accounts are free of charge with several options for signing on. Instructions for creating an account are on the login page.
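
    Because every lane in this archive comes as an R1 (forward) and R2 (reverse) file, a consumer typically needs to pair the files before alignment. The sketch below pairs names by swapping the `_R1_` token for `_R2_`; the helper name pair_reads is hypothetical, and the four file names are a subset of the list in this description.

```python
from typing import Iterable, List, Tuple


def pair_reads(file_names: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each *_R1_* FASTQ file with its *_R2_* mate by name."""
    names = set(file_names)
    pairs = []
    for name in sorted(names):
        if "_R1_" not in name:
            continue
        mate = name.replace("_R1_", "_R2_")
        if mate not in names:
            raise ValueError(f"no R2 mate found for {name}")
        pairs.append((name, mate))
    return pairs


files = [
    "RAPiD-Genomics_F112_CSU_136201_P001_WA04_i5-515_i7-100_S4_L003_R1_001.fastq.gz",
    "RAPiD-Genomics_F112_CSU_136201_P001_WA04_i5-515_i7-100_S4_L003_R2_001.fastq.gz",
    "RAPiD-Genomics-F113-CSU-136201-P001-WA04-i5-515-i7-100_S140_L002_R1_001.fastq.gz",
    "RAPiD-Genomics-F113-CSU-136201-P001-WA04-i5-515-i7-100_S140_L002_R2_001.fastq.gz",
]

for forward, reverse in pair_reads(files):
    print(forward, reverse, sep="\n  ")
```

    Raising an error on a missing mate, rather than skipping the file, makes an incomplete download visible early.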

  16. Whole Genome Shotgun Submissions

    • catalog.data.gov
    • healthdata.gov
    • +3more
    Updated Jun 19, 2025
    Cite
    National Library of Medicine (2025). Whole Genome Shotgun Submissions [Dataset]. https://catalog.data.gov/dataset/whole-genome-shotgun-submissions
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    Whole Genome Shotgun (WGS) projects are genome assemblies of incomplete genomes or incomplete chromosomes of prokaryotes or eukaryotes that are generally being sequenced by a whole genome shotgun strategy. WGS projects may be annotated, but annotation is not required. NCBI has a Prokaryotic Genomes Annotation Pipeline that may be requested at the time the genome files are submitted to GenBank. This pipeline generates a submission-ready annotated file that is posted back to the submitter for review and which the submitter could edit prior to data release. The public WGS projects are listed at https://www.ncbi.nlm.nih.gov/Traces/wgs/

  17. f

    FASTQ Alignment and Pre-Processing Quality Control Reports

    • plus.figshare.com
    bin
    Updated May 13, 2024
    Cite
    Renato Santos; Manuel Corpas (2024). FASTQ Alignment and Pre-Processing Quality Control Reports [Dataset]. http://doi.org/10.25452/figshare.plus.21675005.v3
    Explore at:
    Available download formats: bin
    Dataset updated
    May 13, 2024
    Dataset provided by
    Figshare+
    Authors
    Renato Santos; Manuel Corpas
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains the quality control reports generated by the Sarek pipeline for alignment and pre-processing of the FASTQ data from the severe COVID-19 patient cohort. See related materials at: https://doi.org/10.25452/figshare.plus.c.6347534

  18. n

    Data from: Whole genome sequencing and rare variant analysis in essential...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +2more
    zip
    Updated Aug 26, 2019
    Cite
    Zagaa Odgerel; Shilpa Sonti; Nora Hernandez; Jemin Park; Ruth Ottman; Elan D. Louis; Lorraine N. Clark (2019). Whole genome sequencing and rare variant analysis in essential tremor families [Dataset]. http://doi.org/10.5061/dryad.td8d20v
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 26, 2019
    Authors
    Zagaa Odgerel; Shilpa Sonti; Nora Hernandez; Jemin Park; Ruth Ottman; Elan D. Louis; Lorraine N. Clark
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Essential tremor (ET) is one of the most common movement disorders. The etiology of ET remains largely unexplained. Whole genome sequencing (WGS) is likely to be of value in understanding a large proportion of ET with Mendelian and complex disease inheritance patterns. In ET families with Mendelian inheritance patterns, WGS may lead to gene identification where WES analysis failed to identify the causative single nucleotide variant (SNV) or indel due to incomplete coverage of the entire coding region of the genome, in addition to accurate detection of larger structural variants (SVs) and copy number variants (CNVs). Alternatively, in ET families with complex disease inheritance patterns with gene x gene and gene x environment interactions enrichment of functional rare coding and non-coding variants may explain the heritability of ET. We performed WGS in eight ET families (n=40 individuals) enrolled in the Family Study of Essential Tremor. The analysis included filtering WGS data based on allele frequency in population databases, rare SNV and indel classification and association testing using the Mixed-Model Kernel Based Adaptive Cluster (MM-KBAC) test. A separate analysis of rare SV and CNVs segregating within ET families was also performed. Prioritization of candidate genes identified within families was performed using phenolyzer. WGS analysis identified candidate genes for ET in 5/8 (62.5%) of the families analyzed. WES analysis in a subset of these families in our previously published study failed to identify candidate genes. In one family, we identified a deleterious and damaging variant (c.1367G>A, p.(Arg456Gln)) in the candidate gene, CACNA1G, which encodes the pore forming subunit of T-type Ca(2+) channels, CaV3.1, and is expressed in various motor pathways and has been previously implicated in neuronal autorhythmicity and ET. 
Other candidate genes identified include SLIT3, which encodes an axon guidance molecule; in three families, phenolyzer prioritized genes that are associated with hereditary neuropathies (family A, KARS; family B, KIF5A; and family F, NTRK1). Functional studies of CACNA1G and SLIT3 suggest a role for these genes in ET disease pathogenesis.

  19. N

    Next-Generation Sequencing Data Analysis Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 22, 2025
    + more versions
    Cite
    Data Insights Market (2025). Next-Generation Sequencing Data Analysis Report [Dataset]. https://www.datainsightsmarket.com/reports/next-generation-sequencing-data-analysis-590268
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Jun 22, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Next-Generation Sequencing (NGS) Data Analysis market is experiencing robust growth, driven by the increasing adoption of NGS technologies in various fields, including genomics research, oncology, and personalized medicine. The market's expansion is fueled by several key factors: the declining cost of NGS sequencing, the rising prevalence of chronic diseases necessitating personalized treatment approaches, and the continuous advancements in bioinformatics and data analysis algorithms capable of handling the vast amounts of data generated by NGS. A significant market driver is the growing demand for faster, more accurate, and cost-effective diagnostic tools, particularly in oncology, where NGS plays a crucial role in identifying cancer mutations for targeted therapies. Furthermore, the increasing availability of large-scale genomic datasets and the development of cloud-based data analysis platforms are contributing to market expansion. The competitive landscape is dynamic, with major players like Thermo Fisher Scientific, Illumina, and QIAGEN constantly innovating and expanding their product portfolios. The market is segmented by technology (e.g., whole genome sequencing, exome sequencing), application (e.g., oncology, diagnostics, agriculture), and end-user (e.g., research institutions, hospitals). Despite the rapid growth, the market faces certain challenges. The complexity of NGS data analysis requires specialized expertise and sophisticated software, leading to high operational costs and potential bottlenecks. Data security and privacy concerns are also significant, especially as sensitive patient information is increasingly processed. The need for skilled bioinformaticians and data scientists is outpacing current supply, creating a talent shortage. However, ongoing investments in educational programs and training initiatives are addressing this limitation. 
Future growth will likely be driven by the integration of artificial intelligence and machine learning in data analysis, further accelerating the pace of discovery and improving clinical decision-making. The development of more user-friendly and accessible data analysis tools will also play a critical role in broadening market adoption and making NGS technology readily available to a wider range of users.

  20. d

    Data from: Paired omics Data Platform projects

    • doi.org
    • explore.openaire.eu
    • +3more
    zip
    Updated Jan 1, 2023
    Cite
    Stefan Verhoeven; Justin J.J. van der Hooft; Marnix H. Medema; Pieter C. Dorrestein; Michelle Schorn (2023). Paired omics Data Platform projects [Dataset]. http://doi.org/10.5281/zenodo.7497442
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 1, 2023
    Dataset provided by
    University of California, San Diego, La Jolla, United States of America
    Netherlands eScience Center, Amsterdam, NL
    Wageningen University, Wageningen, NL
    Authors
    Stefan Verhoeven; Justin J.J. van der Hooft; Marnix H. Medema; Pieter C. Dorrestein; Michelle Schorn
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Paired Omics Data Platform is a community-based initiative standardizing links between genomic and metabolomics data in a computer readable format to further the field of natural products discovery. The goals are to link molecules to their producers, find large scale genome-metabolome associations, use genomic data to assist in structural elucidation of molecules, and provide a centralized database for paired datasets. This dataset contains the projects in http://pairedomicsdata.bioinformatics.nl/.

    The JSON documents adhere to the http://pairedomicsdata.bioinformatics.nl/schema.json JSON schema.

Cite
Ricardo Ramiro (2019). Whole-genome sequencing raw data [Dataset]. http://doi.org/10.6084/m9.figshare.10048712.v1

Whole-genome sequencing raw data

Explore at:
73 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: txt
Dataset updated
Nov 20, 2019
Dataset provided by
figshare
Figshare http://figshare.com/
Authors
Ricardo Ramiro
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Whole genome sequencing raw data for the manuscript "Low Mutational Load Allows for High Mutation Rate Variation in Gut Commensal Bacteria" (https://doi.org/10.1101/568709). All fastq files generated for this project are made available in FigShare and are deposited in the NCBI short-read archive: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA580181/ The first three codes in a file name, separated by underscores, indicate the identity of the sequenced clone, which is further described in the file clone_description.csv. Some clones have two sets of R1 and R2 files, as these were sequenced twice to achieve the required coverage. These data will be deposited in the NCBI short-read archive.
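
The naming rule described above (the first three underscore-separated codes identify the clone) can be sketched as follows. This is an illustration only: the file names below are hypothetical and do not come from the dataset, and the helper names clone_id and group_by_clone are assumptions. Grouping by clone also surfaces the clones that were sequenced twice, since they end up with four files instead of two.

```python
from collections import defaultdict


def clone_id(file_name):
    """Return the clone identifier: the first three underscore-separated codes."""
    fields = file_name.split("_")
    # Assumption: at least one further field (run, mate, extension) follows the three codes.
    if len(fields) < 4:
        raise ValueError(f"expected at least four underscore-separated fields: {file_name}")
    return "_".join(fields[:3])


def group_by_clone(file_names):
    """Map each clone identifier to the list of FASTQ files that belong to it."""
    groups = defaultdict(list)
    for name in file_names:
        groups[clone_id(name)].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
files = [
    "A_12_c3_run1_R1.fastq",
    "A_12_c3_run1_R2.fastq",
    "A_12_c3_run2_R1.fastq",
    "A_12_c3_run2_R2.fastq",
    "B_07_c1_run1_R1.fastq",
    "B_07_c1_run1_R2.fastq",
]
groups = group_by_clone(files)
for clone, members in groups.items():
    print(clone, len(members), "files")
```

The resulting identifiers could then be looked up in clone_description.csv, whose column layout is not described here and is therefore left out of the sketch.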
