17 datasets found
  1. Challenging Medically-Relevant Genes Benchmark Set

    • data.nist.gov
    Updated Sep 29, 2021
  2. lra-supplemental-HG002-SV.vcf.tar.gz

    • figshare.com
    application/x-gzip
    Updated Nov 15, 2020
  3. Assembly of human HG002 (GM24385) ONT Q20+ Simplex dataset generated by...

    • zenodo.org
    gz
    Updated Nov 26, 2021
  4. The SV callsets of the HG002 human sample produced by cuteSV with multi...

    • zenodo.org
    • search.datacite.org
    vcf
    Updated Oct 9, 2019
  5. Performance of deletion calls for HG002.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Mar 27, 2020
  6. PopDel identifies medium-size deletions jointly in tens of thousands of...

    • zenodo.org
    tar
    Updated Aug 20, 2020
  7. A public-private-academic consortium hosted by NIST to develop reference...

    • search.datacite.org
    Updated 2015
  8. Heuristics used to determine HG002 genotypes.

    • figshare.com
    xls
    Updated Jul 1, 2020
  9. Supporting data for "xAtlas: Scalable small variant calling across...

    • aspera.gigadb.org
    Updated Nov 14, 2022
  10. Open Genomes Telomere-to-Telomere (T2T) Reference Realignment Project

    • omicsdi.org
    Updated Jan 7, 2019
  11. Minigraph pangenome graphs for HPRC year-1 samples

    • zenodo.org
    gz, log, txt
    Updated Feb 25, 2022
  12. Minigraph pangenome graphs for HPRC year-1 samples

    • zenodo.org
    gz, log, tgz, txt
    Updated Apr 27, 2022
  13. Additional file 3: of Comprehensive evaluation of structural variation...

    • figshare.com
    • springernature.figshare.com
    xlsx
    Updated Jun 4, 2019
  14. Sample graphs and sequences for testing sequence-to-graph alignment

    • zenodo.org
    agc, gz, log
    Updated Feb 5, 2022
  15. Table_1_stLFRsv: A Germline Structural Variant Analysis Pipeline Using...

    • frontiersin.figshare.com
    xlsx
    Updated Mar 18, 2021
  16. Additional file 1 of ECNano: A cost-effective workflow for target enrichment...

    • figshare.com
    xlsx
    Updated Mar 5, 2022
  17. Data from: SVXplorer: three-tier approach to identification of structural...

    • zenodo.org
    gz
    Updated Feb 3, 2020
Cite
National Institute of Standards and Technology (2021). Challenging Medically-Relevant Genes Benchmark Set [Dataset]. http://doi.org/10.18434/mds2-2475

Challenging Medically-Relevant Genes Benchmark Set

3 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Sep 29, 2021
Dataset provided by
National Institute of Standards and Technology (http://nist.gov/)
License

https://www.nist.gov/open/license

Description

CMRG v1.00 is a small variant benchmark and structural variant benchmark focused on 273 challenging medically relevant genes for the Genome in a Bottle (GIAB) sample HG002 (aka Ashkenazi son). These benchmarks were generated from a trio-based hifiasm v0.11 (https://doi.org/10.1038/s41592-020-01056-5) diploid assembly of HG002: PacBio HiFi reads from HG002 were used for assembly, and Illumina reads from the parents, HG003 and HG004, were used to partition the assembly into phased haplotypes. This benchmark contains VCF files for small and structural variants, along with corresponding benchmark BED files. Any position inside a BED region that has no variant in the VCF is treated as homozygous reference. We extensively curated the variant calls, excluding any found to be questionable or erroneous. This benchmark helps measure performance in important challenging regions, including challenging segmental duplications, regions with complex variants, regions with structural variants, and regions affected by false duplications in GRCh37 or GRCh38. This benchmark is described in https://doi.org/10.1101/2021.06.07.444885.
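
The relationship between the benchmark VCF and BED files can be illustrated with a minimal Python sketch. It is not part of the dataset or of any GIAB tooling. The coordinates, the in-memory BED_TEXT and VCF_TEXT strings, and the function names are all hypothetical, and the sketch makes two simplifying assumptions: BED intervals do not overlap, and a variant occupies only its POS, which does not hold for indels or structural variants.

```python
from bisect import bisect_right

# Hypothetical toy data standing in for a benchmark BED and VCF pair.
BED_TEXT = "chr1\t100\t200\nchr1\t500\t650\n"
VCF_TEXT = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tHG002\n"
    "chr1\t150\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\n"
    "chr1\t700\t.\tC\tT\t.\tPASS\t.\tGT\t1|1\n"
)


def parse_bed(text):
    """Return {chrom: sorted [(start, end)]}; BED is 0-based, half-open."""
    regions = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def parse_vcf_positions(text):
    """Return the set of (chrom, pos) for all records; VCF POS is 1-based."""
    variants = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t")[:2]
        variants.add((chrom, int(pos)))
    return variants


def in_benchmark(regions, chrom, pos):
    """True if 1-based pos lies in a BED interval (assumes no overlaps)."""
    intervals = regions.get(chrom, [])
    zero_based = pos - 1
    # Find the last interval whose start is <= zero_based.
    i = bisect_right(intervals, (zero_based, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= zero_based < intervals[i][1]


def classify(regions, variants, chrom, pos):
    """Label a position as outside_benchmark, variant, or hom_ref."""
    if not in_benchmark(regions, chrom, pos):
        return "outside_benchmark"
    return "variant" if (chrom, pos) in variants else "hom_ref"


regions = parse_bed(BED_TEXT)
variants = parse_vcf_positions(VCF_TEXT)

for pos in (150, 160, 700):
    print(pos, classify(regions, variants, "chr1", pos))
```

In this toy example, position 160 is inside a benchmark region with no variant record, so it is labeled homozygous reference, while position 700 carries a variant record but lies outside every benchmark region, so the benchmark makes no claim about it.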
