73 datasets found

Concordance of genotypes represented in VCF and gVCF files with those...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 1, 2023
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne (2023). Concordance of genotypes represented in VCF and gVCF files with those detected by the MI RISK Plus kit. [Dataset]. http://doi.org/10.1371/journal.pone.0132180.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0132180.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Concordance of genotypes represented in VCF and gVCF files with those detected by the MI RISK Plus kit.
NIHR BioResource Rare Diseases WGS project - Hypertrophic Cardiomyopathy...
ega-archive.org
NIHR BioResource Rare Diseases WGS project - Hypertrophic Cardiomyopathy (HCM) Rare Disease domain (VCF data) [Dataset]. https://www.ega-archive.org/datasets/EGAD00001007885
Explore at:
License
https://ega-archive.org/dacs/EGAC00001000259
Description
Short read whole genome sequencing (WGS) VCF files for the NIHR BioResource Rare Diseases WGS project – Participants from the Hypertrophic Cardiomyopathy (HCM) Rare Disease domain
Genotyping of known SNPs from ClinVar using the VCF and gVCF file formats...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 3, 2023
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne (2023). Genotyping of known SNPs from ClinVar using the VCF and gVCF file formats and the number of homozygous reference sites and no-calls based on WGS data. [Dataset]. http://doi.org/10.1371/journal.pone.0132180.t004
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0132180.t004
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Genotyping of known SNPs from ClinVar using the VCF and gVCF file formats and the number of homozygous reference sites and no-calls based on WGS data.
SARS-CoV-2 GISAID isolates (2020-06-17) genotyping VCF
data.mendeley.com
Updated Jul 25, 2020
Doğa Eskier (2020). SARS-CoV-2 GISAID isolates (2020-06-17) genotyping VCF [Dataset]. http://doi.org/10.17632/63t5c7xb4c.1
Explore at:
Unique identifier
https://doi.org/10.17632/63t5c7xb4c.1
Dataset updated
Jul 25, 2020
Authors
Doğa Eskier
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
VCF file containing filtered mutated sites in SARS-CoV-2 genomes obtained from GISAID EpiCoV, separated by individual mutations. The columns correspond to viral genome accession ID, nucleotide position in the genome, mutation ID (left blank in all rows), reference nucleotide, identified mutation, quality, filter, and information columns (all left blank), format (GT in all rows), column corresponding to reference genome (all 0, referring to reference nucleotide column), and columns corresponding to isolate genomes, with each row identifying the nucleotide in the POS column, and whether it is non-mutant (0), or the mutant indicated in the identified mutation column (1). The file is tab delimited, with 22546 rows including the names, and 30690 columns.
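The column layout described above can be illustrated with a minimal Python sketch. It assumes that the first nine tab-separated fields are the standard VCF columns through FORMAT, that the tenth column is the reference genome, and that every later column is one isolate coded 0 or 1. The function name `count_mutant_isolates` and the example row are illustrative assumptions, not taken from the dataset.

```python
def count_mutant_isolates(row):
    """Count isolates carrying the mutation in one tab-delimited data row."""
    fields = row.rstrip("\n").split("\t")
    pos, ref, alt = int(fields[1]), fields[3], fields[4]
    # Field 9 (0-based) is the reference-genome column; isolates follow it.
    isolate_calls = fields[10:]
    n_mutant = sum(1 for call in isolate_calls if call == "1")
    return {"pos": pos, "ref": ref, "alt": alt,
            "mutant": n_mutant, "total": len(isolate_calls)}


# Made-up row: blank ID, QUAL, FILTER and INFO fields, GT format,
# one reference column (0) and four hypothetical isolates.
example_row = "genome_id\t241\t\tC\tT\t\t\t\tGT\t0\t1\t0\t1\t1\n"
summary = count_mutant_isolates(example_row)
print(summary)
```

Dividing `mutant` by `total` gives the per-site mutation frequency across isolates, which is the kind of quantity the description below relies on.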

The file was generated to test the hypothesis whether the five most common mutations in the SARS-CoV-2 genome replication complex proteins, nsps 7, 8, 12, and 14, significantly affect the mutation density of the virus over time and whether these affect the synonymous and nonsynonymous mutation densities differently. We discovered that mutations in nsp14, an exonuclease with error correcting capabilities, are most likely to be correlated with increased mutational load across the genome compared to wildtype SARS-CoV-2. These results were obtained by identifying the frequency of mutations across all isolates in genomic regions of interest, analyzing which of the twenty mutations (five per nsp) have a statistically meaningful relationship with the mutation density in the M and E genes (chosen due to being under little selective pressure), and identifying the synonymous and nonsynonymous genomic SNV density for isolates with any of the statistically meaningful mutations, as well as isolates with none of the identified mutations.
NA12878 WES Benchmark dataset
data.niaid.nih.gov
zenodo.org
Updated May 31, 2020
Pranckeviciene Erinija (2020). NA12878 WES Benchmark dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3597726
Explore at:
Dataset updated
May 31, 2020
Dataset provided by
Vilnius University
Authors
Pranckeviciene Erinija
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset makes available the UCSC Genome Browser (genome.ucsc.edu) GRCh37 genome build public session NA12878 WES Benchmark files in a single dataset so that these files can be used in other applications or genome browsers such as IGV. All genomic variant calls in all VCF files were decomposed and normalized with vt. This dataset contains:

Genome in a Bottle (GIAB) version 3.3.2 high-confidence (HC) variant calls and genomic regions for HapMap individual NA12878:

GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz

GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz.tbi

GIAB_v3.3.2_NA12878_HC_regions.bed

HapMap individual NA12878 WES variant calls (VCF) and capture regions (BED) from diagnostic laboratories:

ARUP whole exome sequencing data (HiSeq 2000) publicly available from the NCBI GeT-RM Browser

converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz

converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz.tbi

ARUP_SeqCap_EZ_Exome.bed

UCSF whole exome sequencing data (HiSeq 2500) publicly available from the NCBI GeT-RM Browser

converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz

converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz.tbi

UCSF_WES_Agilent_V4_Custom.bed

Whole exome data (NextSeq 500) sequenced in CHEO diagnostic laboratory

CHEO_NA12878_WES_S1dataset.vcf.gz

CHEO_NA12878_WES_S1dataset.vcf.gz.tbi

Agilent_CRE_v2.bed

Genomic coordinates (BED) of OMIM genes for which a molecular basis of the associated disease is known (as of September 2019):

Omim_Genes.bed
Genotyping of GWAS catalog sites using the VCF and gVCF file formats and the...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 1, 2023
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne (2023). Genotyping of GWAS catalog sites using the VCF and gVCF file formats and the number of homozygous reference sites and no-calls based on WGS data. [Dataset]. http://doi.org/10.1371/journal.pone.0132180.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0132180.t002
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Genotyping of GWAS catalog sites using the VCF and gVCF file formats and the number of homozygous reference sites and no-calls based on WGS data.
Annotated VCF of 192 Verticillium dahliae isolates
search.dataone.org
data.niaid.nih.gov
Updated Jul 16, 2025
Benjamin Mimee; Joel Lafond-Lapalme; Mario Tenuta (2025). Annotated VCF of 192 Verticillium dahliae isolates [Dataset]. http://doi.org/10.5061/dryad.g79cnp5v0
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.g79cnp5v0
Dataset updated
Jul 16, 2025
Dataset provided by
Dryad Digital Repository
Authors
Benjamin Mimee; Joel Lafond-Lapalme; Mario Tenuta
Time period covered
Jan 1, 2023
Description
Verticillium dahliae is an important soil-borne pathogen causing Verticillium wilt. It is also the primary causal agent of the Potato Early Dying, a disease complex involving the root-lesion nematode. Here, we report the whole-genome sequencing of 192 isolates of V. dahliae originating from the major potato production areas across Canada. Our results yielded a resource of 277,010 genetic variations that will be useful for genetic analyses and revealed the presence of two major lineages, both present in all provinces but exhibiting differences in regional prevalence.

Filtered WGS reads (fastp) aligned on Verticillium dahliae reference (https://www.ncbi.nlm.nih.gov/assembly/95341/GCA_000150675.2) with BWA. VCF called with freebayes v1.3.6 and annotated with snpeff.
Data from: A genome-guided strategy for climate resilience in American...
agdatacommons.nal.usda.gov
data.niaid.nih.gov
bin
Updated Aug 20, 2025
Alexander Sandercock; Jared Westbrook; Qian Zhang; Jason Holliday (2025). Data from: A genome-guided strategy for climate resilience in American chestnut restoration populations [Dataset]. http://doi.org/10.5281/zenodo.10676843
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.10676843
Dataset updated
Aug 20, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alexander Sandercock; Jared Westbrook; Qian Zhang; Jason Holliday
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The American chestnut (Castanea dentata) is a functionally extinct tree species that was decimated by an invasive fungal pathogen in the early 20th century. An understanding of the genomic architecture of local adaptation in wild American chestnut was necessary in order to deploy locally adapted, disease-resistant American chestnut populations. Here, we characterize the genomic basis of climate adaptation in remnant wild American chestnut, develop new computational methods, and evaluate the adaptive genomic content captured within backcross breeding populations. Whole genome re-sequencing data of 356 trees from Sandercock et al. (2022) coupled with genotype-environment association methods identified 18483 climate associated loci.

Methods:

VCF file: The ~21 million SNP dataset from Sandercock et al. (2022) was first imputed using BEAGLE and filtered to remove SNPs with MAF < 0.05. Climate associated loci were then identified using RDA and LFMM2 genotype-environment association methods.

Seed zone shape files: Three seed zones were identified using the ~18k climate associated loci. These regions partition the chestnut range into geographic seed zones that reflect relatively homogeneous areas with respect to multivariate adaptive genomic variation. These regions can be used to conserve germplasm ex situ and guide subsequent breeding crosses that lead to climate-matched restoration populations.

gmbigxhorn.jtl.map.2022.csv is a genetic map generated from American chestnut backcross genotyping-by-sequencing data.

R code for estimating the average migration distance for each seed zone under future climate change conditions.
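The MAF < 0.05 filter mentioned in the methods can be sketched in a few lines of Python. This is an illustration of the general technique, not the authors' pipeline: it assumes biallelic sites with diploid `GT` strings such as `0/1` or `0|0`, treats `.` as a missing allele, and uses a hypothetical function name `minor_allele_frequency`.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic site, or None."""
    alt_count = 0
    called = 0
    for gt in genotypes:
        # Treat phased (|) and unphased (/) separators the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing allele, not counted
            called += 1
            if allele != "0":
                alt_count += 1
    if called == 0:
        return None
    freq = alt_count / called
    return min(freq, 1 - freq)


# Hypothetical sites: one common variant and one rare variant.
common_site = ["0/0", "0/1", "1/1", "./."]
rare_site = ["0/0"] * 19 + ["0/1"]

maf_common = minor_allele_frequency(common_site)
maf_rare = minor_allele_frequency(rare_site)
kept = [maf for maf in (maf_common, maf_rare) if maf >= 0.05]
print(maf_common, maf_rare, kept)
```

Sites whose value falls below the 0.05 threshold would be dropped before the genotype-environment association step.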
PhenoDB
neuinfo.org
scicrunch.org
Updated Oct 23, 2024
(2024). PhenoDB [Dataset]. http://identifiers.org/RRID:SCR_016551
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_016551
Dataset updated
Oct 23, 2024
Description
Database for phenotype genotype associations for humans. Used by clinical researchers to store standardized phenotypic information, diagnosis, and pedigree data and then run analyses on VCF files from individuals, families or cohorts with suspected Mendelian disease.
Unraveling the genetics of feline hypertrophic cardiomyopathy: A multiomics...
search.dataone.org
datadryad.org
Updated Jun 27, 2025
Michael Vandewege; Joanna Kaplan; Victor Rivas; Jalena Wouters; Samantha Harris; Kathryn Meurs; Joshua Stern (2025). Unraveling the genetics of feline hypertrophic cardiomyopathy: A multiomics study of 138 cats [Dataset]. http://doi.org/10.5061/dryad.cjsxksnjh
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.cjsxksnjh
Dataset updated
Jun 27, 2025
Dataset provided by
Dryad Digital Repository
Authors
Michael Vandewege; Joanna Kaplan; Victor Rivas; Jalena Wouters; Samantha Harris; Kathryn Meurs; Joshua Stern
Description
Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiac disease in cats, often leading to congestive heart failure, arterial thromboembolism, and sudden cardiac death. The genetics of feline HCM are poorly understood, and limited genetic discoveries remain breed or family-specific. We aimed to identify novel causative or disease-modifying variants in a large cohort of cats reflective of the general cat population. In a second cohort, we sought to characterize transcriptomic differences between HCM-affected cats and healthy controls. DNA was isolated from 138 domestic cats (109 HCM and 29 controls). No single or combination of variants of high, moderate, or modifying impact were identified in genome-wide analysis to cause or modify the disease severity of HCM. Several rare high and moderate-impact variants in genes associated with human HCM were detected in diseased cats. In a second cohort, left ventricular (LV), interventricular septal (IVS), and left atrial (LA) tissues...

WGS data generation

A total of 1-2 mL of whole blood were collected from the cephalic, saphenous, or jugular vein into EDTA blood collection tubes. DNA was either isolated from whole blood or from buffy coats after whole blood centrifugation at 2000 rpm for 15 minutes. Genomic DNA isolation was performed using commercially available kits (Gentra Puregene Blood kit, QIAGEN, Hilden Germany; ArchivePure;5Prime) and by following the respective manufacturer's protocol. High-quality unfragmented DNA was selected by a combination of 1% agarose gel visualization and spectrophotometric confirmation (a 260/280 ratio of ~1.8 and a concentration of > 50 ng/uL; NanoDrop One/One, Thermofisher, Waltham, GA, USA). Samples were stored at -20°C until ready for shipment to Theragen Bio Co., Ltd, Gyeonggi-do, Republic of Korea for WGS. Paired-end DNA libraries were generated with a TruSeq DNA Nano library prep kit.
Samples were then pooled and sequenced at ~30x coverage on the Illumina NovaSeq6000 platf...

# Unraveling the genetics of feline hypertrophic cardiomyopathy: A multiomics study of 138 cats

Dataset DOI: 10.5061/dryad.cjsxksnjh

Description of the data and file structure

Data available

1. A population-level VCF of polymorphic SNP and indel variants called among 138 domestic cats with and without hypertrophic cardiomyopathy (HCM). The VCF was generated by mapping paired WGS FASTQ reads to the Fca126 reference genome with bwa mem and calling variants following GATK4 best practices. Variant annotations were generated with Ensembl's VEP based on Fca126 gene and exon boundaries. The VCF file contains meta-information lines, followed by a header line that names the fixed fields and the per-sample columns; subsequent data lines detail variants at genomic positions. The fixed fields include chromosome (CHROM), position (POS), identifier (ID), the reference base(s) (REF), alternate base(s) (ALT), quality (QUAL), filter status (FILTER), and additional information ...
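The file structure described above (meta-information lines, one header line, then data lines) can be read with a short Python sketch. It assumes a plain-text, tab-delimited VCF with eight fixed fields followed by a FORMAT column and one column per sample; the function name `read_vcf`, the contig name, and the sample names are made up for illustration.

```python
import io

FIXED_FIELDS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def read_vcf(handle):
    """Yield one dict per data line of a plain-text VCF."""
    samples = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank and meta-information lines
        if line.startswith("#CHROM"):
            samples = line.split("\t")[9:]  # sample names follow FORMAT
            continue
        cols = line.split("\t")
        record = dict(zip(FIXED_FIELDS, cols[:8]))
        record["POS"] = int(record["POS"])
        record["genotypes"] = dict(zip(samples, cols[9:]))
        yield record


# Tiny made-up example with two hypothetical samples.
EXAMPLE = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tcat1\tcat2\n"
    "A1\t1234\t.\tG\tA\t50\tPASS\t.\tGT\t0/1\t0/0\n"
)
records = list(read_vcf(io.StringIO(EXAMPLE)))
print(records[0])
```

For compressed or very large files, a dedicated VCF library is usually a better choice than hand-written parsing.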
Replication data for: Genetic analyses for the response to Bean Leaf Crumple...
search.dataone.org
dataverse.harvard.edu
Updated Nov 9, 2023
Daniel Ariza-Suarez; Beat Keller; Anna Spescha; Johan Steven Aparicio; Victor Mayor; Ana Elizabeth Portilla-Benavides; Hector Fabio Buendia; Juan Miguel Bueno; Bruno Studer; Bodo Raatz (2023). Replication data for: Genetic analyses for the response to Bean Leaf Crumple Virus (BLCrV) identify a candidate LRR-RLK gene [Dataset]. http://doi.org/10.7910/DVN/9JSMED
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/9JSMED
Dataset updated
Nov 9, 2023
Dataset provided by
Harvard Dataverse
Authors
Daniel Ariza-Suarez; Beat Keller; Anna Spescha; Johan Steven Aparicio; Victor Mayor; Ana Elizabeth Portilla-Benavides; Hector Fabio Buendia; Juan Miguel Bueno; Bruno Studer; Bodo Raatz
Time period covered
Jan 1, 2013 - Jan 1, 2020
Description
These datasets contain phenotypic and genotypic data from three connected populations of common bean (Phaseolus vulgaris L.) that were used to identify the genomic regions controlling the phenotypic response to Bean Leaf Crumple Virus (BLCrV). The first is the Andean by Meso (AxM) population, which contains 190 individuals derived from bi-parental crosses between Andean and Mesoamerican breeding lines. The AxM population included 120 additional breeding lines of Andean and Mesoamerican origin that were used as checks for their response against other viral diseases, such as Bean Golden Yellow Mosaic Virus (BGYMV). The second is a pre-breeding population (termed P135-136) composed of 111 lines that was obtained from two-way and three-way crosses between elite Andean lines and resistant sources against viral diseases. The third population is a panel of 186 Mesoamerican breeding lines assembled from a collection of elite materials from the Mesoamerican breeding pipeline at CIAT. The AxM population was evaluated in three yield trials in Palmira (Colombia) between 2013 and 2015 for flowering, maturity time and yield. All three populations were evaluated in three BLCrV trials in Pradera (Colombia), where the disease pressure is naturally high. The AxM and the Mesoamerican panel were genotyped by sequencing (GBS), and these datasets contain their corresponding genotypic matrices in variant-call format (VCF, v4.2) with sequence variants mapped against the reference genome of P. vulgaris (G19833, v2.1). A joint genotypic matrix with all available GBS data from these three populations is also included. The population P135-136 was genotyped with the DArTag targeted genotyping service offered by Diversity Arrays Technology (DArT PL, Bruce ACT, Australia), and the genotypic matrix is similarly included in VCF format.
Raw VCF files
figshare.com
application/gzip
Updated Apr 13, 2023
Christina Cuomo (2023). Raw VCF files [Dataset]. http://doi.org/10.6084/m9.figshare.12693881.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.12693881.v1
Dataset updated
Apr 13, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Christina Cuomo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
VCF files submitted for each group/pipeline.
Climate adaptation and genetic differentiation in the mosquito species Culex...
search.dataone.org
datasetcatalog.nlm.nih.gov
Updated Sep 26, 2025
Yunfei Liao; Touhid Islam; Rooksana Noorai; Jared Streich; Christopher Saski; Lee Cohnstaedt; Elizabeth Cooper (2025). Climate adaptation and genetic differentiation in the mosquito species Culex tarsalis [Dataset]. http://doi.org/10.5061/dryad.51c59zwh3
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.51c59zwh3
Dataset updated
Sep 26, 2025
Dataset provided by
Dryad Digital Repository
Authors
Yunfei Liao; Touhid Islam; Rooksana Noorai; Jared Streich; Christopher Saski; Lee Cohnstaedt; Elizabeth Cooper
Description
The increasing prevalence of vector-borne diseases around the world highlights the pressing need for an in-depth exploration of the genetic and environmental factors that shape the adaptability and widespread distribution of mosquito populations. This research focuses on Culex tarsalis, a principal vector for various viral diseases, including West Nile Virus (WNV). Through the development of a new reference genome and the examination of Restriction-Site Associated DNA sequencing (RAD-seq) data from over 300 individuals and 28 locations, we demonstrate that variables such as temperature, evaporation rates, and the density of vegetation significantly impact the genetic makeup of Cx. tarsalis populations. Among the alleles most strongly associated with environmental factors is a nonsynonymous mutation in a key gene related to circadian rhythms. These results offer new insights into the mechanisms of spread and adaptation in a key North American vector species, which is poised to beco..., Sample Collection Individual mosquitoes were trapped and collected from 28 different locations across the United States and Canada as part of the North American Mosquito Project (NAMP). All samples used in this study were collected in 2012 between the months of April and October. Genome Sequencing, Assembly, and Annotation An F4 population was used to generate the reference genome assembly, and high molecular weight DNA was extracted and sequenced on a Pacific Biosciences (PacBio) RS II (University of Delaware). Thirty-five SMRT cells were generated. The resulting reads provided 76X coverage of the ~790Mb Cx. tarsalis genome, and were assembled with MECAT. Gene annotation was completed by MAKER using EST and protein data from the Culex quinquefasciatus and Aedes aegypti mosquitoes. Sequences were downloaded from the NCBI Taxonomy database and both Trinotate and InterProScan were used for functional annotation of the MAKER predicted genes. 
The annotated assembly was ass...

# Climate adaptation and genetic differentiation in the mosquito species Culex tarsalis

https://doi.org/10.5061/dryad.51c59zwh3

Description of the data and file structure

The data were stored in 8 different files.

bi_20missing_filtSNP_maf_005.recode.vcf

File Details

File Name: bi_10missing_filtSNP_maf_005.recode.vcf

File Format: VCF (Variant Call Format) v4.2

Reference genome: culex.60x.contigs.fasta (from header ##reference=...)

Source software (inferred): GATK-style headers present (e.g., QD, MQ, FS, ReadPosRankSum, RGQ, PL, SB, END, NON_REF) suggest generation with GATK/HaplotypeCaller (gVCF→VCF workflow).

Samples: 7 individuals → 13-2, 13-3, 13-4, 13-5, 13-6, 13-7, 13-8
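The header-based details listed above (reference genome, GATK-style annotation IDs) can be pulled out of a VCF header with a short Python sketch. It assumes plain-text `##reference=`, `##INFO=<ID=...>` and `##FORMAT=<ID=...>` meta lines; the function name `summarize_header` and the example header lines are illustrative assumptions, with only the ID names and the reference file name taken from the text above.

```python
import re


def summarize_header(lines):
    """Collect the reference name and the INFO/FORMAT IDs from header lines."""
    reference = None
    ids = {"INFO": [], "FORMAT": []}
    for line in lines:
        if not line.startswith("##"):
            break  # the #CHROM line ends the meta-information block
        line = line.strip()
        if line.startswith("##reference="):
            reference = line.split("=", 1)[1]
        match = re.match(r"##(INFO|FORMAT)=<ID=([^,>]+)", line)
        if match:
            ids[match.group(1)].append(match.group(2))
    return reference, ids


# Made-up header fragment for illustration only.
header = [
    "##fileformat=VCFv4.2",
    "##reference=culex.60x.contigs.fasta",
    '##INFO=<ID=QD,Number=1,Type=Float,Description="example">',
    '##FORMAT=<ID=PL,Number=G,Type=Integer,Description="example">',
    '##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="example">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t13-2",
]
reference, ids = summarize_header(header)
print(reference, ids)
```

Seeing IDs such as `QD` or `RGQ` in the result is only a hint about the calling software, in the same hedged sense as the "inferred" label above.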

Data Description

- The VCF contains standard columns:
  * #CHROM – Contig/chromosome
  * POS – 1-based position
  * ID – Variant identifier
  * REF – Reference allele
  * ALT – Alternate allele...
Raw data (FASTQ) and processed data (VCF) of 7 patient-derived Sézary...
ega-archive.org
Updated Sep 1, 2025
(2025). Raw data (FASTQ) and processed data (VCF) of 7 patient-derived Sézary Syndrome (SS) cells [Dataset]. https://ega-archive.org/datasets/EGAD50000001646
Explore at:
Dataset updated
Sep 1, 2025
License
https://ega-archive.org/dacs/EGAC50000000693
Description
We profile the whole-transcriptome (bulk RNAseq) of 7 patient-derived Sézary Syndrome (SS) cells to identify expression patterns, functional programs and expressed gene mutations that may provide clues on new therapeutic options for SS patients. The libraries were sequenced on NextSeq500 (Illumina) with a paired-end read length of 2x75bp. Raw data (FASTQ) and obtained processed data (VCF) including all called raw variants are available.
Data from: Divergent selection in low recombination regions shapes the...
zenodo.org
data.niaid.nih.gov
application/gzip, bin
Updated Dec 15, 2023
Wenjun Zhou; Wenjun Zhou (2023). Divergent selection in low recombination regions shapes the genomic islands in two incipient shorebird species [Dataset]. http://doi.org/10.5061/dryad.4f4qrfjjp
Explore at:
Available download formats: bin, application/gzip
Unique identifier
https://doi.org/10.5061/dryad.4f4qrfjjp
Dataset updated
Dec 15, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Wenjun Zhou; Wenjun Zhou
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Divergent selection in the face of gene flow is usually associated with a heterogeneous genomic landscape of divergence in nascent species pairs. However, multiple factors, such as divergent selection and local recombination rate variation, can influence the formation of these genomic islands. This conundrum can be solved through examination of the genomic landscapes of species pairs that are still in the early stages of evolution. In this study, population genomics analyses were undertaken using a wide range of sampling and whole-genome resequencing data from 96 unrelated individuals of Kentish plover (Charadrius alexandrinus) and white-faced plover (C. dealbatus). We suggest that the two species exhibit varying levels of population admixture along the Chinese coast and on Taiwan Island. Genome-wide analyses for introgression indicate that ancient introgression occurred in the Taiwan population, and recurrent gene flow is still ongoing in mainland coastal populations. Furthermore, we identified a few genomic regions with significant levels of interspecific differentiation and local recombination suppression, which contain several genes potentially associated with disease resistance, coloration, and regulation of plumage moulting, and thus may be connected to the phenotypic and ecological divergence of the two nascent species. Overall, our findings suggest that divergent selection in low recombination regions may be the main force in shaping the genomic islands in two incipient shorebird species.
Comparison of the number of dbSNP, ClinVar and GWAScat sites represented...
plos.figshare.com
xls
Updated Jun 3, 2023
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne (2023). Comparison of the number of dbSNP, ClinVar and GWAScat sites represented using VCF, gVCF and eVCF files. [Dataset]. http://doi.org/10.1371/journal.pone.0132180.t007
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0132180.t007
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Alberto Ferrarini; Luciano Xumerle; Francesca Griggio; Marianna Garonzi; Chiara Cantaloni; Cesare Centomo; Sergio Marin Vargas; Patrick Descombes; Julien Marquis; Sebastiano Collino; Claudio Franceschi; Paolo Garagnani; Benjamin A. Salisbury; John Max Harvey; Massimo Delledonne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Comparison of the number of dbSNP, ClinVar and GWAScat sites represented using VCF, gVCF and eVCF files.
Data from: Acquired dysfunction of CFTR underlies cystic fibrosis-like...
search.dataone.org
datasetcatalog.nlm.nih.gov
Updated Jul 20, 2024
Jody Gookin (2024). Acquired dysfunction of CFTR underlies cystic fibrosis-like disease of the canine gallbladder [Dataset]. http://doi.org/10.5061/dryad.2rbnzs7xq
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.2rbnzs7xq
Dataset updated
Jul 20, 2024
Dataset provided by
Dryad Digital Repository
Authors
Jody Gookin
Description
Mucocele formation in dogs is a unique and enigmatic muco-obstructive disease of the gallbladder caused by amassment of abnormal mucus that bears striking pathological similarity to cystic fibrosis. We investigated the role of CFTR in the pathogenesis of this disease. The location and frequency of disease-associated variants in the coding region of CFTR was compared using whole genome sequence data from 2,642 dogs representing breeds at low-risk, high-risk, or with confirmed disease. Expression, localization, and ion transport activity of CFTR was quantified in control and mucocele gallbladders by NanoString, Western blotting, immunofluorescence imaging, and studies in Ussing chambers. Our results establish significant loss of CFTR-dependent anion secretion by mucocele gallbladder mucosa. A significantly lower quantity of CFTR protein was demonstrated relative to E-cadherin in mucocele compared to control gallbladder mucosa. Immunofluorescence identified CFTR along the apical membrane o...

We used the Whole Animal Genome Sequencing (WAGS) pipeline to identify short nucleotide variants in a dataset of 2,642 dogs encompassing both private and public resources including 1,971 genomes from the Dog10K project. Briefly, the WAGS pipeline used Burrows-Wheeler Alignment tool-MEM to map paired-end reads to the UU_Cfam_GSD_1.0 reference genome. Variant calling was executed with Genome Analysis Toolkit (GATK4), and Ensembl's Variant Effect Predictor (VEP, RRID:SCR_007931) predicted variant annotations and consequences. From the resulting VEP-processed VCF file, we extracted CFTR genic variants plus variants within 1Kb of the flanking sequence that passed filters. Subsequently, non-reference allele frequencies were calculated for each variant within the control, risk, and affected dog groups.

# Acquired dysfunction of CFTR underlies cystic fibrosis-like disease of the canine gallbladder.

https://doi.org/10.5061/dryad.2rbnzs7xq

This dataset includes supplementary materials for the manuscript entitled Acquired dysfunction of CFTR underlies cystic fibrosis-like disease of the canine gallbladder.

Description of the data and file structure

Supplemental Figure S1 illustrates sample procurement and appearance of gallbladder from each of 9 dogs having mucosal RNA extracted for targeted gene expression analysis. Samples of lumen mucosa were obtained by excision from regions devoid of mucus or from which mucus could be gently removed. During sampling (panel A) and after removal of sample (panel B). Remaining panels show each of 9 individual mucocele gallbladders used for mucosal RNA sample collection. Pictures are immediately post-cholecystectomy followed by opening of the gallbladder to expose the lumen.

**Supplemental Table S1...
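The methods summary for this dataset says that non-reference allele frequencies were calculated for each variant within the control, risk, and affected groups. A minimal Python sketch of that kind of calculation follows. It is not the authors' code: the function names, the sample names, and the example genotypes are hypothetical, and it assumes genotypes are available as VCF-style GT strings.

```python
from collections import defaultdict


def non_ref_allele_frequency(genotypes):
    """Return the fraction of called alleles that are non-reference.

    `genotypes` is an iterable of VCF-style GT strings such as "0/1",
    "1|1", or "./." (missing). Returns None if no allele was called.
    """
    non_ref = 0
    called = 0
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") separators alike.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # skip missing alleles
            called += 1
            if allele != "0":
                non_ref += 1
    return non_ref / called if called else None


def frequencies_by_group(sample_genotypes, sample_groups):
    """Compute the non-reference allele frequency of one variant per group."""
    grouped = defaultdict(list)
    for sample, gt in sample_genotypes.items():
        grouped[sample_groups[sample]].append(gt)
    return {group: non_ref_allele_frequency(gts) for group, gts in grouped.items()}


# Hypothetical genotypes for a single variant (illustration only).
example_genotypes = {"dog1": "0/0", "dog2": "0/1", "dog3": "1/1",
                     "dog4": "./.", "dog5": "0|1"}
example_groups = {"dog1": "control", "dog2": "control", "dog3": "affected",
                  "dog4": "risk", "dog5": "risk"}
example_result = frequencies_by_group(example_genotypes, example_groups)
```

Missing calls are excluded from the denominator here, which is one common convention; the source does not say how missing genotypes were handled.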
Genotypes of Aedes aegypti mosquitoes derived from SNP chip and low-coverage...
search.dataone.org
dataone.org
Updated Oct 15, 2025
Andres Gomez-Palacio; Gen Morinaga; Paul Turner; Maria Victoria Micieli; Mohammed-Ahmed Elnour; Bashir Salim; Sinnathamby Noble Surendran; Ranjan Ramasamy; Jeffrey Powell; John Soghigian; Andrea Gloria-Soria (2025). Genotypes of Aedes aegypti mosquitoes derived from SNP chip and low-coverage whole genome sequencing for platform cross-validation [Dataset]. http://doi.org/10.5061/dryad.m0cfxppbd
Unique identifier
https://doi.org/10.5061/dryad.m0cfxppbd
Dataset updated
Oct 15, 2025
Dataset provided by
Dryad Digital Repository
Authors
Andres Gomez-Palacio; Gen Morinaga; Paul Turner; Maria Victoria Micieli; Mohammed-Ahmed Elnour; Bashir Salim; Sinnathamby Noble Surendran; Ranjan Ramasamy; Jeffrey Powell; John Soghigian; Andrea Gloria-Soria
Description
The mosquito Aedes aegypti is the primary vector of many human arboviruses such as dengue, yellow fever, chikungunya, and Zika, which affect millions of people worldwide. Population genetics studies on this mosquito have been important in understanding its invasion pathways and success as a vector of human disease. The Axiom aegypti1 SNP chip was developed from a sample of geographically diverse Ae. aegypti populations to facilitate genomic studies on this species. Here we evaluate the utility of the Axiom aegypti1 SNP chip for population genetics and compare it with a low-depth shotgun sequencing approach using mosquitoes from the species' native (Africa) and invasive range (outside Africa). These analyses indicate that the results from the SNP chip are highly reproducible and have a higher sensitivity to capture alternative alleles than a low-coverage whole-genome sequencing approach. Although the SNP chip suffers from ascertainment bias, results from population structure, ancestry,...

DNA from individual Aedes aegypti mosquitoes was extracted and used for genotyping at 50,000 loci distributed along the species' genome, using the Axiom Aegypti1 SNP chip (Life Technologies Corporation CAT#550481). Files "all_snps_G3Dryad" and "Replicas_SNPchip" contain all 50,000 SNPs genotyped, prior to filtering. File "50k_SNPs_30_samples_LD_MAF_miss_FINAL" contains the SNPs after applying filters in Plink 1.9 (https://www.cog-genomics.org/plink/) for linkage disequilibrium (LD: --indep-pairwise 50 10 0.3), minor allele frequency (MAF: --maf 0.1) and missing data (--geno 0.1).

# Genotypes of Aedes aegypti mosquitoes derived from SNP chip and low-coverage whole genome sequencing for platform cross-validation

https://doi.org/10.5061/dryad.m0cfxppbd

Files: Replicas_SNPchip

SNP chip data generated from 20 individual Aedes aegypti mosquitos from Sudan and Sri Lanka using the Axiom Aegypti1 array (Life Technologies Corporation CAT#550481). Each mosquito was genotyped independently in triplicate on different chips. All 50,000 loci genotyped are included, prior to any filtering.
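Triplicate genotyping of this kind is typically used to check reproducibility. One simple way to do that is to compute pairwise concordance between replicate calls for the same individual, as in the Python sketch below. This is an illustration under stated assumptions, not the authors' analysis: the function names and example calls are hypothetical, genotypes are assumed to be coded as alternate-allele counts (0, 1, 2), and `None` is assumed to mark a missing call.

```python
from itertools import combinations


def pairwise_concordance(calls_a, calls_b):
    """Fraction of loci with identical calls, ignoring loci missing in either."""
    compared = 0
    matches = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # skip loci without a call in both replicates
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else None


def replicate_concordance(replicates):
    """Concordance for every pair of replicates of one individual."""
    return {
        (i, j): pairwise_concordance(replicates[i], replicates[j])
        for i, j in combinations(range(len(replicates)), 2)
    }


# Hypothetical calls at five loci for three replicates of one mosquito.
example_replicates = [
    [0, 1, 2, 1, None],
    [0, 1, 2, 2, 1],
    [0, 1, None, 1, 1],
]
example_concordance = replicate_concordance(example_replicates)
```

Excluding loci with a missing call from the comparison keeps missingness and discordance as separate quantities; counting a missing call as a mismatch would be a different, stricter definition.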

Files: all_snps_G3Dryad

SNP chip data generated from 13 individual Aedes aegypti mosquitos from populations worldwide using the Axiom Aegypti1 array (Life Technologies Corporation CAT#550481). All 50,000 loci genotyped are included, prior to any filtering.

File: 50k_SNPs_30_samples_LD_MAF_miss_FINAL.vcf.gz

Variant call format (VCF) file containing 30 individual *Aedes aegypti* genotypes used for population genetic analysis, with five individu...,
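The dataset description states that this file was produced with Plink 1.9 filters for linkage disequilibrium, minor allele frequency (0.1), and missing data (0.1). The Python sketch below illustrates only the per-variant MAF and missingness checks; it does not reproduce Plink and leaves out LD pruning entirely. The function name and the example genotype vectors are hypothetical, and genotypes are assumed to be diploid alternate-allele counts (0, 1, 2) with `None` for a missing call.

```python
def passes_maf_and_missingness(genotypes, min_maf=0.1, max_missing=0.1):
    """Return True if a variant passes MAF and missing-call-rate thresholds.

    A variant is kept when its missing call rate does not exceed
    `max_missing` and its minor allele frequency is at least `min_maf`.
    """
    total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if not called:
        return False  # no information at this variant

    missing_rate = (total - len(called)) / total
    if missing_rate > max_missing:
        return False

    # Each diploid call contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf


# Hypothetical variants across ten samples (illustration only).
common_variant = [0, 1, 1, 0, 0, 2, 0, 0, 0, None]      # MAF 4/18, 10% missing
rare_variant = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]           # MAF 1/20
sparse_variant = [0, 1, 1, 0, 2, 1, 0, 1, None, None]   # 20% missing
```

With these thresholds, `common_variant` is retained while `rare_variant` fails the MAF check and `sparse_variant` fails the missingness check.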
Data from: Genome-wide association study of an unusual dolphin mortality...
datadryad.org
data.niaid.nih.gov
zip
Updated Nov 29, 2018
Kimberley C. Batley; Jonathan Sandoval-Castillo; Catherine M. Kemper; Catherine R.M. Attard; Nikki Zanardo; Ikuko Tomo; Luciano B. Beheregaray; Luciana M. Möller (2018). Genome-wide association study of an unusual dolphin mortality event reveals candidate genes for susceptibility and resistance to cetacean morbillivirus [Dataset]. http://doi.org/10.5061/dryad.tk8774f
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.tk8774f
Dataset updated
Nov 29, 2018
Dataset provided by
Dryad
Authors
Kimberley C. Batley; Jonathan Sandoval-Castillo; Catherine M. Kemper; Catherine R.M. Attard; Nikki Zanardo; Ikuko Tomo; Luciano B. Beheregaray; Luciana M. Möller
Time period covered
Nov 29, 2018
Area covered
St. Vincent Gulf, South Australia
Description
Tursiops SNP dataset: SNP genotype, vcf file. Mapped to the Tursiops truncatus genome (GCA_001922835.1). File: mappedQC.fil5.vcf
Tursiops ref_seqF: Forward reference sequences. File: Tur_1.fasta
Tursiops ref_seqR: Reverse reference sequences. File: Tur_2.fasta
TBX1
alliancegenome.org
Updated Apr 16, 2025
Alliance of Genome Resources (2025). TBX1 [Dataset]. http://identifiers.org/HGNC:11592
Unique identifier
https://identifiers.org/HGNC:11592
Dataset updated
Apr 16, 2025
Dataset authored and provided by
Alliance of Genome Resources
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
T-box transcription factor 1. Enables protein homodimerization activity and sequence-specific double-stranded DNA binding activity. Involved in several processes, including chordate embryonic development; parathyroid gland development; and soft palate development. Predicted to be active in chromatin and nucleus. Implicated in several diseases, including DiGeorge syndrome; congenital heart disease (multiple); hypoparathyroidism; sensorineural hearing loss; and velocardiofacial syndrome. Biomarker of congenital heart disease. This gene is a member of a phylogenetically conserved family of genes that share a common DNA-binding domain, the T-box. T-box genes encode transcription factors involved in the regulation of developmental processes. This gene product shares 98% amino acid sequence identity with the mouse ortholog. DiGeorge syndrome (DGS)/velocardiofacial syndrome (VCFS), a common congenital disorder characterized by neural-crest-related developmental defects, has been associated with deletions of chromosome 22q11.2, where this gene has been mapped. Studies using mouse models of DiGeorge syndrome suggest a major role for this gene in the molecular etiology of DGS/VCFS. Several alternatively spliced transcript variants encoding different isoforms have been described for this gene. [provided by RefSeq, Jul 2008]



