100+ datasets found
  1. Whole Exome Sequencing

    • ega-archive.org
    Updated Sep 21, 2018
    Cite
    (2018). Whole Exome Sequencing [Dataset]. https://ega-archive.org/datasets/EGAD00001004352
    Dataset updated
    Sep 21, 2018
    License

    https://ega-archive.org/dacs/EGAC00001001008

    Description

    The Whole Exome Sequencing dataset contains 30 whole exome sequencing files (tumor, germ line DNA) and phenotype metadata for 15 patients on the phase II clinical trial of neoadjuvant immune checkpoint blockade in high-risk resectable melanoma at MD Anderson Cancer Center (NCT02519322). Included are data from baseline samples.

  2. NA12878 WES Benchmark dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated May 31, 2020
    + more versions
    Cite
    Pranckeviciene Erinija; Pranckeviciene Erinija (2020). NA12878 WES Benchmark dataset [Dataset]. http://doi.org/10.5281/zenodo.3597727
    Available download formats: application/gzip, bin
    Dataset updated
    May 31, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pranckeviciene Erinija; Pranckeviciene Erinija
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset makes available the UCSC Genome Browser (genome.ucsc.edu) GRCh37 genome build public session NA12878 WES Benchmark files in a single dataset so that these files can be used in other applications or genome browsers such as IGV. All genomic variant calls in all VCF files were decomposed and normalized with vt. This dataset contains:

    1. Genome in a Bottle (GIAB) version 3.3.2 high-confidence (HC) variant calls and genomic regions for HapMap individual NA12878:
      1. GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz
      2. GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz.tbi
      3. GIAB_v3.3.2_NA12878_HC_regions.bed
    2. HapMap individual NA12878 WES variant calls (VCF) and capture regions (BED) from diagnostic laboratories:
      • ARUP whole exome sequencing data (HiSeq 2000) publicly available from the NCBI GeT-RM Browser
        1. converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz
        2. converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz.tbi
        3. ARUP_SeqCap_EZ_Exome.bed
      • UCSF whole exome sequencing data (HiSeq 2500) publicly available from the NCBI GeT-RM Browser
        1. converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz
        2. converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz.tbi
        3. UCSF_WES_Agilent_V4_Custom.bed
      • Whole exome data (NextSeq 500) sequenced in the CHEO diagnostic laboratory
        1. CHEO_NA12878_WES_S1dataset.vcf.gz
        2. CHEO_NA12878_WES_S1dataset.vcf.gz.tbi
        3. Agilent_CRE_v2.bed
    3. Genomic coordinates (BED) of OMIM genes for which a molecular basis of the associated disease is known (as of September 2019):
      • Omim_Genes.bed
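
    As a rough illustration of how the BED region files listed above might be combined with the VCF calls, the sketch below checks whether a variant position falls inside any region. It relies on the usual conventions that BED intervals are 0-based and half-open while VCF positions are 1-based. The function names and the example coordinates are hypothetical and are not taken from the dataset; the linear scan is chosen for clarity, not speed.

```python
def load_regions(bed_lines):
    """Parse BED text lines into {chrom: [(start, end), ...]}.

    Only the first three BED columns are used. Header lines that start
    with '#', 'track' or 'browser' are skipped.
    """
    regions = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def in_regions(regions, chrom, vcf_pos):
    """Return True if a 1-based VCF position lies inside any BED interval.

    BED intervals are 0-based and half-open, so the VCF position is
    shifted down by one before comparing.
    """
    pos0 = vcf_pos - 1
    return any(start <= pos0 < end for start, end in regions.get(chrom, []))


# Hypothetical intervals, for illustration only.
example_bed = ["chr1\t100\t200", "chr1\t300\t400"]
example_regions = load_regions(example_bed)
```

    With real data, `bed_lines` could be the lines of an opened file such as GIAB_v3.3.2_NA12878_HC_regions.bed; for large files an interval index would be a better fit than the scan shown here.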

  3. Overview of whole exome sequencing data production.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Xiao-Ping Qi; Ju-Ming Ma; Zhen-Fang Du; Rong-Biao Ying; Jun Fei; Hang-Yang Jin; Jian-Shan Han; Jin-Quan Wang; Xiao-Ling Chen; Chun-Yue Chen; Wen-Ting Liu; Jia-Jun Lu; Jian-Guo Zhang; Xian-Ning Zhang (2023). Overview of whole exome sequencing data production. [Dataset]. http://doi.org/10.1371/journal.pone.0020353.t003
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Xiao-Ping Qi; Ju-Ming Ma; Zhen-Fang Du; Rong-Biao Ying; Jun Fei; Hang-Yang Jin; Jian-Shan Han; Jin-Quan Wang; Xiao-Ling Chen; Chun-Yue Chen; Wen-Ting Liu; Jia-Jun Lu; Jian-Guo Zhang; Xian-Ning Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview of whole exome sequencing data production.

  4. Whole-genome sequencing raw data

    • figshare.com
    txt
    Updated Nov 20, 2019
    Cite
    Ricardo Ramiro (2019). Whole-genome sequencing raw data [Dataset]. http://doi.org/10.6084/m9.figshare.10048712.v1
    Available download formats: txt
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ricardo Ramiro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Whole genome sequencing raw data for the manuscript "Low Mutational Load Allows for High Mutation Rate Variation in Gut Commensal Bacteria" (https://doi.org/10.1101/568709). All fastq files generated for this project are made available in FigShare and are deposited in the NCBI short-read archive (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA580181/). The first three codes in a file name, separated by underscores, indicate the identity of the sequenced clone, which is further described in the file clone_description.csv. Some clones have two sets of R1 and R2 files, as these were sequenced twice to achieve the required coverage. These data will be deposited in the NCBI short-read archive.
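
    The file-naming rule described above (the first three underscore-separated codes identify the clone) can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function names and the example file names are hypothetical, and real file names should be checked against clone_description.csv.

```python
from collections import defaultdict


def clone_id(filename):
    """Return the clone identity: the first three underscore-separated codes."""
    parts = filename.split("_")
    if len(parts) < 4:
        # Expect three identity codes plus at least one further field.
        raise ValueError(f"unexpected file name layout: {filename!r}")
    return "_".join(parts[:3])


def group_by_clone(filenames):
    """Group fastq file names by clone identity.

    Clones that were sequenced twice end up with more than one R1/R2 pair
    under the same key.
    """
    groups = defaultdict(list)
    for name in filenames:
        groups[clone_id(name)].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
example_files = [
    "A_1_c3_run1_R1.fastq.gz",
    "A_1_c3_run1_R2.fastq.gz",
    "A_1_c3_run2_R1.fastq.gz",
    "B_2_c7_run1_R1.fastq.gz",
]
example_groups = group_by_clone(example_files)
```

    Grouping by the clone key is one simple way to spot clones with two sets of R1 and R2 files before merging reads.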

  5. Whole-exome and whole-genome sequencing data

    • ega-archive.org
    Updated Jun 12, 2019
    Cite
    (2019). Whole-exome and whole-genome sequencing data [Dataset]. https://ega-archive.org/datasets/EGAD00001005087
    Dataset updated
    Jun 12, 2019
    License

    https://ega-archive.org/dacs/EGAC00001005087

    Description

    Multi-omic data for lung neuroendocrine neoplasms, including the first multi-omic sequencing data for the understudied lung atypical carcinoids. The data include whole-exome, whole-genome, RNA-seq, and EPIC 850K methylation array data.

  6. Data from: Genomic characterization of relapsed acute myeloid leukemia...

    • figshare.scilifelab.se
    • researchdata.se
    Updated Jan 15, 2025
    + more versions
    Cite
    Linda Holmfeldt; Svea Stratmann (2025). Data from: Genomic characterization of relapsed acute myeloid leukemia reveals novel putative therapeutic targets [Dataset]. http://doi.org/10.17044/scilifelab.12292778.v1
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    Uppsala Universitet
    Authors
    Linda Holmfeldt; Svea Stratmann
    License

    https://www.scilifelab.se/data/restricted-access/

    Description

    Data Set Description

    These data were collected from a total of 73 participants (48 adult; 25 pediatric), all of whom had relapsed or primary resistant acute myeloid leukemia. The data, which are separated here into an adult and a pediatric dataset, were generated as part of a study by Stratmann et al., 2021 (https://doi.org/10.1182/bloodadvances.2020003709).

    Please note that separate applications are necessary for the adult and pediatric datasets. When applying for access, please indicate which dataset the application concerns.

    The adult dataset contains whole genome sequencing data from 62 diagnosis (D), relapse (R1/R2/R3) and/or primary resistant (PR) leukemia samples, and 37 normal (G) DNA samples from 37 patients, as well as whole exome sequencing data from 20 leukemia samples and one (1) normal DNA sample from 15 patients.

    The pediatric dataset contains whole genome sequencing data from 49 diagnosis (D), relapse (R1/R2/R3), persistent relapse (R1/2-P) and/or primary resistant (PR) leukemia samples and 24 normal (G) DNA samples from 23 patients, as well as whole exome sequencing data from seven (7) leukemia samples from five (5) patients.

    The leukemia samples originate from bone marrow or peripheral blood. The patient-matched normal DNA samples originate from either complete remission bone marrow or peripheral blood cells, or from normal bone marrow stromal cells cultivated from leukemia bone marrow. Further details regarding the samples are available in the Supplemental Information part of Stratmann et al., 2021 (https://doi.org/10.1182/bloodadvances.2020003709).

    Whole genome sequencing libraries and associated next-generation sequencing were carried out by the SNP&SEQ Technology platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. Libraries were prepared using the TruSeq PCR-free DNA sample preparation Kit, followed by paired-end 150 bp read length sequencing on a HiSeqX (Illumina Inc.).

    Whole exome sequencing libraries and associated next-generation sequencing were carried out by the Uppsala Genome Center, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. Libraries were prepared using the Ion AmpliSeq Whole Exome Library Preparation kit, followed by sequencing on the Ion Proton System using Ion PI Chip v2 and Ion PI Sequencing 200 Kit v3 chemistry (Thermo Fisher Scientific).

    Terms for access

    The adult and pediatric datasets are only to be used for research that seeks to advance the understanding of the influence of genetic factors on human acute myeloid leukemia etiology and biology.

    Use of the protected pediatric dataset is restricted to research projects that can only be conducted using pediatric acute myeloid leukemia data, and for which the research objectives cannot be accomplished using data from adults. Applications aimed at method development would thus not be considered acceptable for use of the pediatric dataset. Further, the pediatric dataset may not be used for research investigating predisposition for acute myeloid leukemia based on germline variants.

    To apply for conditional access to the adult and/or pediatric dataset in this publication, please contact datacentre@scilifelab.se.

  7. Genomic Data Analysis Service Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 10, 2025
    Cite
    Archive Market Research (2025). Genomic Data Analysis Service Report [Dataset]. https://www.archivemarketresearch.com/reports/genomic-data-analysis-service-55611
    Available download formats: pdf, ppt, doc
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global genomic data analysis service market is experiencing robust growth, projected to reach $1769.8 million in 2025 and exhibiting a Compound Annual Growth Rate (CAGR) of 13.1% from 2025 to 2033. This expansion is fueled by several key factors. Advances in next-generation sequencing (NGS) technologies are generating massive datasets, driving the demand for sophisticated analytical tools and services. The increasing affordability of genomic sequencing, coupled with expanding applications across healthcare, agriculture, and environmental research, further contributes to market growth. Specifically, the human application segment dominates, driven by personalized medicine initiatives and the growing understanding of the role of genomics in disease diagnosis and treatment. The rise of cloud-based data analysis platforms, offering scalability and cost-effectiveness, is also a significant driver. While data security and privacy concerns present a challenge, the development of robust data management and security protocols is mitigating this risk. Furthermore, the growing adoption of AI and machine learning in genomic data analysis enhances accuracy and efficiency, accelerating market growth. Segmentation within the market reveals strong performance across various application areas. Whole Genome Sequence Analysis and Whole Exome Sequence Analysis segments are major contributors, reflecting the comprehensive nature of the data generated and the insights derived. Geographically, North America currently holds a significant market share, driven by early adoption of advanced technologies and strong funding for research and development. However, Asia Pacific is anticipated to experience rapid growth, fueled by increasing investments in healthcare infrastructure and expanding genomics research activities in countries like China and India. Competition is intense, with established players like Illumina and QIAGEN alongside emerging companies offering specialized solutions. 
The continuous innovation in sequencing technologies and analytical methods ensures the ongoing evolution and expansion of this dynamic market.
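
    To make the growth figures in this description concrete, the sketch below applies the standard compound-growth formula, value = base * (1 + rate) ** years, to the stated 2025 market size and CAGR. It assumes 2025 is the base year and that growth compounds once per year through 2033 (eight periods); the report does not spell this out, and the function name is hypothetical.

```python
def project_value(base, cagr, years):
    """Project a value forward under a constant compound annual growth rate."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return base * (1.0 + cagr) ** years


# Figures quoted in the description above.
base_2025_musd = 1769.8   # market size in 2025, million USD
cagr = 0.131              # 13.1% per year

# Assumption: 2025 is the base year, so 2033 is eight compounding periods later.
projected_2033_musd = project_value(base_2025_musd, cagr, 2033 - 2025)
print(f"Projected 2033 market size: {projected_2033_musd:.1f} million USD")
```

    Treating 2025 as period zero is a choice; counting nine periods instead would give a larger figure, so the result is an illustration of the arithmetic and not a number from the report.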

  8. Data from: The impact of post-alignment processing procedures on whole-exome...

    • scielo.figshare.com
    jpeg
    Updated Jun 3, 2023
    Cite
    Murilo Guimarães Borges; Helena Tadiello de Moraes; Cristiane de Souza Rocha; Iscia Lopes-Cendes (2023). The impact of post-alignment processing procedures on whole-exome sequencing data [Dataset]. http://doi.org/10.6084/m9.figshare.14320533.v1
    Available download formats: jpeg
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    SciELO journals
    Authors
    Murilo Guimarães Borges; Helena Tadiello de Moraes; Cristiane de Souza Rocha; Iscia Lopes-Cendes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: The use of post-alignment procedures has been suggested to prevent the identification of false-positives in massive DNA sequencing data. Insertions and deletions are most likely to be misinterpreted by variant calling algorithms. Using known genetic variants as references for post-processing pipelines can minimize mismatches. They allow reads to be correctly realigned and recalibrated, resulting in more parsimonious variant calling. In this work, we aim to investigate the impact of using different sets of common variants as references to facilitate variant calling from whole-exome sequencing data. We selected reference variants from common insertions and deletions available within the 1K Genomes project data and from databases from the Latin American Database of Genetic Variation (LatinGen). We used the Genome Analysis Toolkit to perform post-processing procedures like local realignment, quality recalibration procedures, and variant calling in whole exome samples. We identified an increased number of variants from the call set for all groups when no post-processing procedure was performed. We found that there was a higher concordance rate between variants called using 1K Genomes and LatinGen. Therefore, we believe that the increased number of rare variants identified in the analysis without realignment or quality recalibration indicated that they were likely false-positives.

  9. DataSheet_2_SECNVs: A Simulator of Copy Number Variants and Whole-Exome...

    • figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Yue Xing; Alan R. Dabney; Xiao Li; Guosong Wang; Clare A. Gill; Claudio Casola (2023). DataSheet_2_SECNVs: A Simulator of Copy Number Variants and Whole-Exome Sequences From Reference Genomes.pdf [Dataset]. http://doi.org/10.3389/fgene.2020.00082.s002
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Yue Xing; Alan R. Dabney; Xiao Li; Guosong Wang; Clare A. Gill; Claudio Casola
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Copy number variants are duplications and deletions of the genome that play an important role in phenotypic changes and human disease. Many software applications have been developed to detect copy number variants using either whole-genome sequencing or whole-exome sequencing data. However, there is poor agreement in the results from these applications. Simulated datasets containing copy number variants allow comprehensive comparisons of the operating characteristics of existing and novel copy number variant detection methods. Several software applications have been developed to simulate copy number variants and other structural variants in whole-genome sequencing data. However, none of the applications reliably simulate copy number variants in whole-exome sequencing data. We have developed and tested Simulator of Exome Copy Number Variants (SECNVs), a fast, robust and customizable software application for simulating copy number variants and whole-exome sequences from a reference genome. SECNVs is easy to install, implements a wide range of commands to customize simulations, can output multiple samples at once, and incorporates a pipeline to output rearranged genomes, short reads and BAM files in a single command. Variants generated by SECNVs are detected with high sensitivity and precision by tools commonly used to detect copy number variants. SECNVs is publicly available at https://github.com/YJulyXing/SECNVs.

  10. Semantic prioritization of novel causative genomic variants

    • plos.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Imane Boudellioua; Rozaimi B. Mahamad Razali; Maxat Kulmanov; Yasmeen Hashish; Vladimir B. Bajic; Eva Goncalves-Serra; Nadia Schoenmakers; Georgios V. Gkoutos; Paul N. Schofield; Robert Hoehndorf (2023). Semantic prioritization of novel causative genomic variants [Dataset]. http://doi.org/10.1371/journal.pcbi.1005500
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Imane Boudellioua; Rozaimi B. Mahamad Razali; Maxat Kulmanov; Yasmeen Hashish; Vladimir B. Bajic; Eva Goncalves-Serra; Nadia Schoenmakers; Georgios V. Gkoutos; Paul N. Schofield; Robert Hoehndorf
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we demonstrate the PhenomeNET Variant Predictor (PVP) system that exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients suffering from congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants.

  11. Training data for 'Exome sequencing data analysis' tutorial (Galaxy Training...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Aug 4, 2022
    Cite
    Wolfgang Maier; Wolfgang Maier (2022). Training data for 'Exome sequencing data analysis' tutorial (Galaxy Training Material) [Dataset]. http://doi.org/10.5281/zenodo.3054169
    Available download formats: bin
    Dataset updated
    Aug 4, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Wolfgang Maier; Wolfgang Maier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data used in this tutorial are a subset of the data published previously in Training material for the course "Exome analysis with GALAXY". Credit for uploading the original data goes to Paolo Uva and Gianmauro Cuccuru!

    Specifically, you may need the following datasets for following the tutorial:

    Raw sequencing reads

    Premapped sequencing reads

    Reference sequence (human chromosome 8)

    If you would just like to play with GEMINI rather than work through the full tutorial, you'll find below a prebuilt GEMINI database (for GEMINI version 0.20.1) for the family trio. You can start exploring this database without having to run GEMINI load and, in fact, without having to install GEMINI's bundled annotation data.

  12. Whole Exome Sequencing Data of prDLBCL

    • ega-archive.org
    Updated Sep 16, 2024
    Cite
    (2024). Whole Exome Sequencing Data of prDLBCL [Dataset]. https://ega-archive.org/datasets/EGAD50000000591
    Dataset updated
    Sep 16, 2024
    License

    https://ega-archive.org/dacs/EGAC50000000284

    Description

    We sequenced 34 prDLBCL samples using whole exome sequencing (WES) to evaluate possible mutational signatures and driver mutations associated with the patients' clinical and cytogenetic characteristics.

  13. Genomic Data Analysis Service Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 10, 2025
    Cite
    Archive Market Research (2025). Genomic Data Analysis Service Report [Dataset]. https://www.archivemarketresearch.com/reports/genomic-data-analysis-service-55807
    Available download formats: pdf, doc, ppt
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Genomic Data Analysis Service market is experiencing robust growth, projected to reach $4192.3 million in 2025. While the provided CAGR is missing, considering the rapid advancements in genomics technologies and increasing demand for personalized medicine, a conservative estimate of 15% CAGR from 2025-2033 seems reasonable. This implies significant market expansion, driven by factors such as decreasing sequencing costs, growing adoption of next-generation sequencing (NGS) technologies, and the increasing need for efficient and accurate analysis of large genomic datasets. The market is segmented by application (human, plant, animal, microorganism, virus) and by type of analysis (whole genome sequence analysis, whole exome sequence analysis, and others). The growth is fueled by the expanding application of genomic analysis across diverse sectors like healthcare, agriculture, and environmental science. Whole genome sequencing is expected to dominate the market due to its comprehensive nature, providing a complete picture of an organism's genetic makeup. However, whole exome sequencing remains a significant segment due to its cost-effectiveness and ability to target specific protein-coding regions. Key players such as Illumina, QIAGEN, and BGI Genomics are leading the market through continuous innovation in software and analytical tools. The market's geographical spread is substantial, with North America and Europe currently holding the largest market shares due to well-established research infrastructure and technological advancements. However, the Asia-Pacific region is projected to witness significant growth driven by rising investments in healthcare infrastructure and increasing adoption of genomic technologies. The market is expected to continue its upward trajectory throughout the forecast period (2025-2033), driven by ongoing technological innovations that enhance data analysis speed and accuracy.
The increasing availability of large genomic datasets, fueled by large-scale genomics initiatives, provides a fertile ground for the development of advanced analytical tools. Furthermore, the increasing demand for personalized medicine and precision agriculture is further accelerating the adoption of genomic data analysis services. However, challenges remain, including the need for standardized data formats, data security concerns associated with handling sensitive genomic data, and the need for skilled professionals to interpret and utilize the complex data generated. Addressing these challenges will be critical for continued market growth and widespread adoption of genomic data analysis services.

  14. Additional file 2 of dbNSFP v4: a comprehensive database of...

    • springernature.figshare.com
    xlsx
    Updated Feb 13, 2024
    Cite
    Xiaoming Liu; Chang Li; Chengcheng Mou; Yibo Dong; Yicheng Tu (2024). Additional file 2 of dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs [Dataset]. http://doi.org/10.6084/m9.figshare.13316921.v1
    Available download formats: xlsx
    Dataset updated
    Feb 13, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Xiaoming Liu; Chang Li; Chengcheng Mou; Yibo Dong; Yicheng Tu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2: Table S2. Nonmissing counts of ssSNV, nsSNV and 45 scores per chromosome. Table S3. Density of rank scores based on 100 bins (bin size = 0.01). Table S4. Pearson’s correlation coefficients between rank scores. Table S5. Ratio of binary predictions’ agreement between scores. Table S6. AUROC/VUROC scores between TP testing set and different TN testing sets for the 45 deleterious prediction scores in dbNSFP v4.1.

  15. Data from: Aggregated variant data from whole-genome sequenced tinnitus...

    • data.niaid.nih.gov
    • produccioncientifica.ugr.es
    • +1 more
    Updated Nov 9, 2022
    Cite
    Gallus, Silvano (2022). Aggregated variant data from whole-genome sequenced tinnitus patients (TIGER) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7304955
    Dataset updated
    Nov 9, 2022
    Dataset provided by
    Escalera-Balsera, Alba
    Lopez-Escamez, Jose Antonio
    Gallego-Martinez, Alvaro
    Frejo, Lidia
    Perez-Carpena, Patricia
    Trpchevska, Natalia
    Robles-Bolivar, Paula
    Canlon, Barbara
    Gallus, Silvano
    Cederroth, Christopher R
    Roman-Naranjo, Pablo
    Bulla, Jan
    Description

    Aggregated variant data obtained from tinnitus patients from Sweden.

    Uploaded datasets are stored in annotated CSV files. Annotation was performed using VEP (v106), including population frequencies for each variant from gnomAD, non-Finnish Europeans from gnomAD, and the Swedish population from the SweGen project. Pathogenicity scores from CADD are also annotated for each variant. Variants from genes found to be enriched in a gene burden analysis can be found in this aggregated dataset.

    agg.tiger.csv - The TIGER cohort is composed of 97 Swedish whole-genome sequenced constant tinnitus patients.

    agg.jaguar.csv - The JAGUAR cohort is composed of 147 Swedish whole-exome sequenced tinnitus patients.

    agg.sevtin.csv - The SEVTIN cohort is a subcohort of TIGER, with 34 WGS patients segregating a severe tinnitus phenotype.

    agg.controls.csv - Controls is a Swedish population cohort composed of 151 whole-exome sequenced individuals.

  16. Assessing reliability of intra-tumor heterogeneity estimates from single...

    • plos.figshare.com
    tiff
    Updated Jun 3, 2023
    Cite
    Judith Abécassis; Anne-Sophie Hamy; Cécile Laurent; Benjamin Sadacca; Hélène Bonsang-Kitzis; Fabien Reyal; Jean-Philippe Vert (2023). Assessing reliability of intra-tumor heterogeneity estimates from single sample whole exome sequencing data [Dataset]. http://doi.org/10.1371/journal.pone.0224143
    Available download formats: tiff
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Judith Abécassis; Anne-Sophie Hamy; Cécile Laurent; Benjamin Sadacca; Hélène Bonsang-Kitzis; Fabien Reyal; Jean-Philippe Vert
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Tumors are made of evolving and heterogeneous populations of cells which arise from successive appearance and expansion of subclonal populations, following acquisition of mutations conferring them a selective advantage. Those subclonal populations can be sensitive or resistant to different treatments, and provide information about tumor aetiology and future evolution. Hence, it is important to be able to assess the level of heterogeneity of tumors with high reliability for clinical applications. In the past few years, a large number of methods have been proposed to estimate intra-tumor heterogeneity from whole exome sequencing (WES) data, but the accuracy and robustness of these methods on real data remains elusive. Here we systematically apply and compare 6 computational methods to estimate tumor heterogeneity on 1,697 WES samples from the cancer genome atlas (TCGA) covering 3 cancer types (breast invasive carcinoma, bladder urothelial carcinoma, and head and neck squamous cell carcinoma), and two distinct input mutation sets. We observe significant differences between the estimates produced by different methods, and identify several likely confounding factors in heterogeneity assessment for the different methods. We further show that the prognostic value of tumor heterogeneity for survival prediction is limited in those datasets, and find no evidence that it improves over prognosis based on other clinical variables. In conclusion, heterogeneity inference from WES data on a single sample, and its use in cancer prognosis, should be considered with caution. Other approaches to assess intra-tumoral heterogeneity such as those based on multiple samples may be preferable for clinical applications.

  17. Next-Generation Sequencing Data Analysis Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 22, 2025
    + more versions
    Cite
    Data Insights Market (2025). Next-Generation Sequencing Data Analysis Report [Dataset]. https://www.datainsightsmarket.com/reports/next-generation-sequencing-data-analysis-590268
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Jun 22, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Next-Generation Sequencing (NGS) Data Analysis market is experiencing robust growth, driven by the increasing adoption of NGS technologies in various fields, including genomics research, oncology, and personalized medicine. The market's expansion is fueled by several key factors: the declining cost of NGS sequencing, the rising prevalence of chronic diseases necessitating personalized treatment approaches, and the continuous advancements in bioinformatics and data analysis algorithms capable of handling the vast amounts of data generated by NGS. A significant market driver is the growing demand for faster, more accurate, and cost-effective diagnostic tools, particularly in oncology, where NGS plays a crucial role in identifying cancer mutations for targeted therapies. Furthermore, the increasing availability of large-scale genomic datasets and the development of cloud-based data analysis platforms are contributing to market expansion. The competitive landscape is dynamic, with major players like Thermo Fisher Scientific, Illumina, and QIAGEN constantly innovating and expanding their product portfolios. The market is segmented by technology (e.g., whole genome sequencing, exome sequencing), application (e.g., oncology, diagnostics, agriculture), and end-user (e.g., research institutions, hospitals). Despite the rapid growth, the market faces certain challenges. The complexity of NGS data analysis requires specialized expertise and sophisticated software, leading to high operational costs and potential bottlenecks. Data security and privacy concerns are also significant, especially as sensitive patient information is increasingly processed. The need for skilled bioinformaticians and data scientists is outpacing current supply, creating a talent shortage. However, ongoing investments in educational programs and training initiatives are addressing this limitation. 
    Future growth will likely be driven by the integration of artificial intelligence and machine learning in data analysis, further accelerating the pace of discovery and improving clinical decision-making. The development of more user-friendly and accessible data analysis tools will also play a critical role in broadening market adoption and making NGS technology readily available to a wider range of users.

  18. Next Generation Sequencing (NGS) Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Apr 23, 2025
    Cite
    Market Research Forecast (2025). Next Generation Sequencing (NGS) Report [Dataset]. https://www.marketresearchforecast.com/reports/next-generation-sequencing-ngs-145671
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Apr 23, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Next-Generation Sequencing (NGS) market, currently valued at $3.797 billion (2025), is projected to experience robust growth, driven by a compound annual growth rate (CAGR) of 3.3% from 2025 to 2033. This expansion is fueled by several key factors. The increasing prevalence of genetic diseases and cancers necessitates advanced diagnostic tools, making NGS a critical technology for personalized medicine. Furthermore, the declining cost of sequencing and the development of more sophisticated and high-throughput platforms are making NGS more accessible to researchers and clinicians. The rising demand for genomic data in academic and government research, along with the growing adoption of NGS in pharmaceutical and biotechnology companies for drug discovery and development, is significantly contributing to market growth. Hospitals and clinics are also increasingly incorporating NGS into their workflows for diagnostics and personalized treatment plans, further bolstering the market's expansion. While regulatory hurdles and data interpretation complexities pose some challenges, the overall market outlook remains highly positive. The market segmentation reveals a significant contribution from targeted sequencing, driven by its cost-effectiveness and suitability for specific diagnostic applications. Whole exome and whole genome sequencing, although more expensive, are also witnessing substantial growth, driven by their comprehensive data output enabling deeper insights into genetic variations. Geographically, North America currently holds a significant market share due to advanced healthcare infrastructure and a strong research ecosystem. However, Asia Pacific is emerging as a rapidly growing market, driven by increasing investments in healthcare infrastructure and rising awareness of genomics in the region. 
    The competitive landscape is dominated by major players like Illumina, Thermo Fisher Scientific, and Roche, along with several other significant players that are constantly innovating and expanding their product offerings. This continuous innovation in NGS technology, combined with favorable regulatory changes and increasing investment in genomics research globally, promises sustained market growth in the coming years.

  19. Whole genome sequencing of three North American large-bodied birds

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Whole genome sequencing of three North American large-bodied birds [Dataset]. https://catalog.data.gov/dataset/whole-genome-sequencing-of-three-north-american-large-bodied-birds
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The data release details the samples, methods, and raw data used to generate high-quality genome assemblies for greater sage-grouse (Centrocercus urophasianus), white-tailed ptarmigan (Lagopus leucura), and trumpeter swan (Cygnus buccinator). The raw data have been deposited in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI), the authoritative repository for public biological sequence data, and are not included in this data release. Instead, the accessions that link to those data via the NCBI portal (www.ncbi.nlm.nih.gov) are provided herein. The release consists of a single file, sample.metadata.txt, which maps NCBI accessions to the samples sequenced and the different types of sequencing performed to generate the assemblies and annotate their gene features.

  20. Alzheimer Disease Whole Exome Sequencing

    • portal.conp.ca
    Updated Jun 26, 2024
    Cite
    The Montreal Neurological Institute and Hospital Bioinformatics Core (2024). Alzheimer Disease Whole Exome Sequencing [Dataset]. https://portal.conp.ca/dataset?id=projects/cbig_alzheimers_disease_wes
    Explore at:
    Dataset updated
    Jun 26, 2024
    Dataset authored and provided by
    The Montreal Neurological Institute and Hospital Bioinformatics Core
    Description

    This dataset contains whole exome sequencing data from 4 individuals with Alzheimer Disease, recruited at the Montreal Neurological Institute and Hospital. The recruiting clinician was Dr. Guy Rouleau. Sequencing reads were processed with a pipeline consisting of Burrows-Wheeler Aligner (BWA) alignment, Genome Analysis Toolkit (GATK)/Picard post-alignment processing, and GATK HaplotypeCaller variant calling.

