The Database of Genotype and Phenotype (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans.
A database developed to archive and distribute clinical data and results from studies that have investigated the interaction of genotype and phenotype in humans. It covers results from studies including genome-wide association studies, medical sequencing, molecular diagnostic assays, and associations between genotype and non-clinical traits.
This dataset tracks the updates made to the dataset "Database of Genotype and Phenotype (dbGaP)" and serves as a repository for previous versions of the data and metadata.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome-wide association studies (GWAS) usually rely on the assumption that different samples are not from closely related individuals. Detection of duplicates and close relatives becomes more difficult both statistically and computationally when one wants to combine datasets that may have been genotyped on different platforms. The dbGaP repository at the National Center for Biotechnology Information (NCBI) contains datasets from hundreds of studies with over one million samples. There are many duplicates and closely related individuals both within and across studies from different submitters. Relationships between studies cannot always be identified by the submitters of individual datasets. To aid in curation of dbGaP, we developed a rapid statistical method called Genetic Relationship and Fingerprinting (GRAF) to detect duplicates and closely related samples, even when the sets of genotyped markers differ and the DNA strand orientations are unknown. GRAF extracts genotypes of 10,000 informative and independent SNPs from genotype datasets obtained using different methods, and implements quick algorithms that enable it to find all of the duplicate pairs from more than 880,000 samples within and across dbGaP studies in less than two hours. In addition, GRAF uses two statistical metrics called All Genotype Mismatch Rate (AGMR) and Homozygous Genotype Mismatch Rate (HGMR) to determine subject relationships directly from the observed genotypes, without estimating probabilities of identity by descent (IBD), or kinship coefficients, and compares the predicted relationships with those reported in the pedigree files. We implemented GRAF in a freely available C++ program of the same name. In this paper, we describe the methods in GRAF and validate the usage of GRAF on samples from the dbGaP repository. Other scientists can use GRAF on their own samples and in combination with samples downloaded from dbGaP.
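As a rough illustration of the two metrics named in this abstract, the sketch below computes mismatch rates for one pair of samples. It is not the GRAF implementation. It assumes that AGMR is the fraction of mismatched genotypes among SNPs called in both samples, and that HGMR is the same fraction restricted to SNPs where both samples are homozygous. The function name `mismatch_rates` and the 0/1/2 allele-count coding are hypothetical, and the sketch assumes both samples are already coded against the same allele (it does not handle unknown strand orientation).

```python
from math import nan
from typing import Optional, Sequence, Tuple

# Assumed coding: count of one allele per SNP.
# 0 or 2 = homozygous, 1 = heterozygous, None = missing call.
Genotype = Optional[int]


def mismatch_rates(a: Sequence[Genotype], b: Sequence[Genotype]) -> Tuple[float, float]:
    """Return (HGMR, AGMR) for two samples genotyped at the same ordered SNPs."""
    if len(a) != len(b):
        raise ValueError("both samples must cover the same SNP list")

    # Keep only SNPs with a genotype call in both samples.
    called = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    # Subset where both samples are homozygous.
    both_hom = [(x, y) for x, y in called if x != 1 and y != 1]

    agmr = sum(x != y for x, y in called) / len(called) if called else nan
    hgmr = sum(x != y for x, y in both_hom) / len(both_hom) if both_hom else nan
    return hgmr, agmr


if __name__ == "__main__":
    sample_1 = [0, 0, 2, 1, 2, None]
    sample_2 = [0, 2, 2, 1, 1, 0]
    hgmr, agmr = mismatch_rates(sample_1, sample_2)
    print(f"HGMR={hgmr:.3f} AGMR={agmr:.3f}")
```

Under these assumed definitions, duplicate samples should give both rates near zero apart from genotyping error. Thresholds for classifying relationships are not reproduced here.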
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mean HGMR and AGMR values and correlation coefficients between HGMR and AGMR of all related subjects reported in the data files submitted to dbGaP.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Predicted HGMR values and standard deviations for different types of relationships assuming allele frequencies are evenly distributed between 0.1 and 0.9.
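To give a sense of how such a prediction can be computed, the sketch below estimates the expected HGMR for one case only: a pair of unrelated individuals. It is an illustration under stated assumptions, not the method behind the table. It assumes Hardy-Weinberg proportions, independent SNPs, and HGMR defined as the fraction of opposite-homozygote SNPs among SNPs where both individuals are homozygous. The function name `expected_hgmr_unrelated` is hypothetical. Other relationship types and the standard deviations are not covered.

```python
def expected_hgmr_unrelated(lo: float = 0.1, hi: float = 0.9, steps: int = 10_000) -> float:
    """Expected HGMR for two unrelated individuals, with allele frequency p
    spread evenly over [lo, hi], via midpoint-rule integration."""
    width = (hi - lo) / steps
    opposite_hom = 0.0  # both homozygous, for different alleles
    both_hom = 0.0      # both homozygous, for any allele
    for i in range(steps):
        p = lo + (i + 0.5) * width
        q = 1.0 - p
        # Under Hardy-Weinberg, P(AA) = p^2 and P(BB) = q^2 for each individual.
        opposite_hom += 2.0 * p * p * q * q
        both_hom += (p * p + q * q) ** 2
    # With many independent SNPs, the observed rate approaches this ratio of sums.
    return opposite_hom / both_hom


if __name__ == "__main__":
    print(f"{expected_hgmr_unrelated():.4f}")
```

Related pairs share alleles identical by descent, so their expected HGMR would be lower than the unrelated value computed here. Deriving those cases requires relationship-specific sharing probabilities that this sketch does not model.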
In this study, we address the enormous challenges common complex diseases pose for genomic analysis and the enormous opportunities surmounting them offers for advancing healthcare. The common genetic disorders proposed for study here are believed to have extreme locus heterogeneity, requiring the analysis of large numbers of samples to comprehensively identify the genomic variants underlying them. We propose that a combination of deep population studies and joint analysis of SNPs, indels, and structural variants, both in coding and noncoding regions, will provide the next level of understanding of common genetic disorders. Whole genome sequencing (WGS) will be critical to this next-generation approach to the genomics of complex disease. WGS will need to be accompanied by the technical ability to generate and handle very large data sets, a particular focus and strength of NYGC. WGS will also need to be accompanied by new statistical tools and algorithms... (for more see dbGaP study page.)
The NIMH Repository and Genomics Resource (RGR) stores biosamples, genetic, pedigree and clinical data collected in designated NIMH-funded human subject studies. The RGR database also links to other repositories holding data from the same subjects, including dbGaP, GEO and NDAR. The NIMH RGR allows the broader research community to access these data and biospecimens (e.g., lymphoblastoid cell lines, induced pluripotent cell lines, fibroblasts) and further expand the genetic and molecular characterization of patient populations with severe mental illness.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Subtypes are provided for each of the 555 Stage 1 GWAS samples included in the clustering analysis according to their dbGaP SUBJIDs. dbGaP, database of Genotypes and Phenotypes; GWAS, genome-wide association study; SUBJID, subject ID. (TXT)
Supports finding human phenotype/genotype relationships with queries by phenotype, chromosome location, gene, and SNP identifiers. Currently includes information from dbGaP, the National Human Genome Research Institute (NHGRI) genome-wide association study (GWAS) Catalog, and Genotype-Tissue Expression (GTEx).
The purpose of this study is to address the key question of whether and how family health history (FHH) is adopted as a tool to more efficiently manage patients at risk for breast, colon, ovarian, and hereditary cancer syndromes as well as thrombophilia and coronary heart disease (CHD) and to provide evidence supporting clinical utility -- improved health behaviors in patients and physician screening recommendations. Five health care delivery organizations will participate in this demonstration project: Duke University, the Medical College of Wisconsin, the Air Force, Essentia Health, and the University of North Texas. Duke will serve as a coordinating center for this project (Pro00043372) as well as a site. Healthcare Effectiveness Data and Information Set (HEDIS) measures as intermediate clinical effectiveness measures for CHD and selected cancers as well as survey/formative data and electronic medical record (EMR) data will be used... (for more see dbGaP study page.)
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Preprint: Dand N, Stuart PE, et al., GWAS meta-analysis of psoriasis identifies new susceptibility alleles impacting disease mechanisms and therapeutic targets, medRxiv. 2023 Oct 5:2023.10.04.23296543. doi: 10.1101/2023.10.04.23296543. PMID: 37873414
Abstract: Psoriasis is a common, debilitating immune-mediated skin disease. Genetic studies have identified biological mechanisms of psoriasis risk, including those targeted by effective therapies. However, the genetic liability to psoriasis is not fully explained by variation at robustly identified risk loci. To refine the genetic map of psoriasis susceptibility we meta-analysed 18 GWAS comprising 36,466 cases and 458,078 controls and identified 109 distinct psoriasis susceptibility loci, including 46 that have not been previously reported. These include susceptibility variants at loci in which the therapeutic targets IL17RA and AHR are encoded, and deleterious coding variants supporting potential new drug targets (including in STAP2, CPVL and POU2F3). We conducted a transcriptome-wide association study to identify regulatory effects of psoriasis susceptibility variants and cross-referenced these against single cell expression profiles in psoriasis-affected skin, highlighting roles for the transcriptional regulation of haematopoietic cell development and epigenetic modulation of interferon signalling in psoriasis pathobiology.
This dataset: This study used a custom LD reference panel comprising six GWAS datasets.
Individual level genotype data for the CASP GWAS, PsA GWAS, and Exomechip case-control studies are available on dbGaP (dbGaP: phs000019.v1.p1 [https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000019.v1.p1], phs000982.v1.p1 [http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000982.v1.p1], and phs001306.v1.p1 [http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001306.v1.p1]), and WTCCC2 genotype data are archived at the European Genome-Phenome Archive (study ID EGAS00000000108 [https://ega-archive.org/studies/EGAS00000000108]). Data sharing restrictions do not allow making genotype data publicly available for the remaining two case-control cohorts. However, LD matrices based on the full reference panel for all 109 susceptibility loci have been deposited in the King’s Open Research Data System.