17 datasets found
  1. Challenging Medically-Relevant Genes Benchmark Set

    • data.nist.gov
    Updated Sep 29, 2021
  2. lra-supplemental-HG002-SV.vcf.tar.gz

    • figshare.com
    application/x-gzip
    Updated Nov 15, 2020
  3. Assembly of human HG002 (GM24385) ONT Q20+ Simplex dataset generated by...

    • zenodo.org
    gz
    Updated Nov 26, 2021
  4. Performance of deletion calls for HG002.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Mar 27, 2020
  5. PopDel identifies medium-size deletions jointly in tens of thousands of...

    • zenodo.org
    tar
    Updated Aug 20, 2020
  6. Minigraph pangenome graphs for HPRC year-1 samples

    • explore.openaire.eu
    • zenodo.org
    Updated Mar 11, 2022
  7. Heuristics used to determine HG002 genotypes.

    • figshare.com
    xls
    Updated Jul 1, 2020
  8. A public-private-academic consortium hosted by NIST to develop reference...

    • search.datacite.org
    Updated 2015
  9. Sample graphs and sequences for testing sequence-to-graph alignment

    • explore.openaire.eu
    • zenodo.org
    Updated Dec 8, 2021
  10. Open Genomes Telomere-to-Telomere (T2T) Reference Realignment Project

    • www.omicsdi.org
  11. Sample graphs and sequences for testing sequence-to-graph alignment

    • zenodo.org
    • explore.openaire.eu
    agc, gz, log
    Updated Feb 5, 2022
  12. Minigraph pangenome graphs for HPRC year-1 samples

    • zenodo.org
    gz, log, tgz, txt
    Updated Apr 27, 2022
  13. Additional file 3: of Comprehensive evaluation of structural variation...

    • figshare.com
    • springernature.figshare.com
    xlsx
    Updated Jun 4, 2019
  14. Data from: SVXplorer: three-tier approach to identification of structural...

    • zenodo.org
    gz
    Updated Feb 3, 2020
  15. Long reads training material for 'Quality Control' tutorial (Galaxy Training...

    • zenodo.org
    • explore.openaire.eu
    gz, txt
    Updated Nov 23, 2021
  16. Table_1_stLFRsv: A Germline Structural Variant Analysis Pipeline Using...

    • frontiersin.figshare.com
    xlsx
    Updated Mar 18, 2021
  17. Additional file 1 of ECNano: A cost-effective workflow for target enrichment...

    • figshare.com
    xlsx
    Updated Mar 5, 2022

Cite
National Institute of Standards and Technology (2021). Challenging Medically-Relevant Genes Benchmark Set [Dataset]. http://doi.org/10.18434/mds2-2475

Challenging Medically-Relevant Genes Benchmark Set

3 scholarly articles cite this dataset
Dataset updated
Sep 29, 2021
Dataset provided by
National Institute of Standards and Technology (http://nist.gov/)
License

https://www.nist.gov/open/license

Description

CMRG v1.00 is a small variant benchmark and structural variant benchmark focused on 273 challenging medically relevant genes for the Genome in a Bottle (GIAB) sample HG002 (aka Ashkenazi son). The benchmarks were generated from a trio-based hifiasm v0.11 (https://doi.org/10.1038/s41592-020-01056-5) diploid assembly of HG002, using PacBio HiFi reads from HG002 for assembly and Illumina reads from the parents, HG003 and HG004, for partitioning into phased haplotypes. The benchmark contains VCF files for small and structural variants, along with corresponding benchmark BED files. Any position inside a BED region that has no variant in the VCF is considered homozygous reference. We extensively curated the variant calls, excluding any found to be questionable or erroneous. This benchmark helps measure performance in important challenging regions, including challenging segmental duplications, regions with complex variants, regions with structural variants, and regions affected by false duplications in GRCh37 or GRCh38. The benchmark is described in https://doi.org/10.1101/2021.06.07.444885.
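The relationship between the benchmark BED and VCF files can be sketched in a few lines of Python. This is only an illustration, not the tooling GIAB uses: the function names (parse_bed, in_benchmark, classify) and the toy coordinates are hypothetical, variants are simplified to single 1-based positions (real variants such as deletions can span several bases), and the BED regions are assumed not to overlap. It relies on BED intervals being 0-based and half-open, while VCF positions are 1-based.

```python
from bisect import bisect_right


def parse_bed(lines):
    """Parse BED lines into {chrom: sorted [(start, end), ...]} (0-based, half-open)."""
    regions = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def in_benchmark(regions, chrom, pos):
    """Return True if the 1-based position `pos` falls inside a benchmark region.

    Assumes the regions on each chromosome do not overlap.
    """
    intervals = regions.get(chrom, [])
    zero_based = pos - 1  # convert the VCF-style coordinate to BED-style
    # Index of the last interval whose start is <= zero_based.
    i = bisect_right(intervals, (zero_based, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= zero_based < intervals[i][1]


def classify(regions, variant_sites, chrom, pos):
    """Classify a 1-based position against the benchmark.

    `variant_sites` is a set of (chrom, pos) pairs taken from the benchmark VCF
    (a simplification: only the POS field is considered).
    """
    if not in_benchmark(regions, chrom, pos):
        return "outside_benchmark"  # the benchmark makes no claim here
    if (chrom, pos) in variant_sites:
        return "variant"
    return "hom_ref"  # inside a region, no variant: homozygous reference


# Toy data, not taken from the real benchmark files.
regions = parse_bed(["chr1\t100\t200", "chr1\t300\t400"])
variant_sites = {("chr1", 150)}
print(classify(regions, variant_sites, "chr1", 150))
print(classify(regions, variant_sites, "chr1", 101))
print(classify(regions, variant_sites, "chr1", 250))
```

Positions outside the BED regions are deliberately left unclassified, since the benchmark makes no statement about them.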
