Database of Genotype and Phenotype (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans.
Database developed to archive and distribute clinical data and results from studies that have investigated the interaction of genotype and phenotype in humans. It covers results of studies including genome-wide association studies, medical sequencing, molecular diagnostic assays, and associations between genotype and non-clinical traits.
This dataset tracks the updates made on the dataset "Database of Genotype and Phenotype (dbGaP)" as a repository for previous versions of the data and metadata.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome-wide association studies (GWAS) usually rely on the assumption that different samples are not from closely related individuals. Detection of duplicates and close relatives becomes more difficult both statistically and computationally when one wants to combine datasets that may have been genotyped on different platforms. The dbGaP repository at the National Center for Biotechnology Information (NCBI) contains datasets from hundreds of studies with over one million samples. There are many duplicates and closely related individuals both within and across studies from different submitters. Relationships between studies cannot always be identified by the submitters of individual datasets. To aid in curation of dbGaP, we developed a rapid statistical method called Genetic Relationship and Fingerprinting (GRAF) to detect duplicates and closely related samples, even when the sets of genotyped markers differ and the DNA strand orientations are unknown. GRAF extracts genotypes of 10,000 informative and independent SNPs from genotype datasets obtained using different methods, and implements quick algorithms that enable it to find all of the duplicate pairs from more than 880,000 samples within and across dbGaP studies in less than two hours. In addition, GRAF uses two statistical metrics called All Genotype Mismatch Rate (AGMR) and Homozygous Genotype Mismatch Rate (HGMR) to determine subject relationships directly from the observed genotypes, without estimating probabilities of identity by descent (IBD) or kinship coefficients, and compares the predicted relationships with those reported in the pedigree files. We implemented GRAF in a freely available C++ program of the same name. In this paper, we describe the methods in GRAF and validate the usage of GRAF on samples from the dbGaP repository. Other scientists can use GRAF on their own samples and in combination with samples downloaded from dbGaP.
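The two mismatch metrics named above can be illustrated with a minimal Python sketch. It is not the GRAF C++ implementation. It assumes AGMR is the fraction of mismatched genotypes among SNPs genotyped in both samples, and HGMR is the same fraction restricted to SNPs where both samples are homozygous. The 0/1/2 genotype encoding (copies of one allele, None for missing) and the function name mismatch_rates are hypothetical choices made for illustration, and the sketch does not handle unknown strand orientation.

```python
from typing import Optional, Sequence, Tuple

# Assumed encoding: 0, 1 or 2 copies of a chosen allele; None = missing call.
Genotype = Optional[int]


def mismatch_rates(
    a: Sequence[Genotype], b: Sequence[Genotype]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (AGMR, HGMR) for two samples genotyped on the same SNP list.

    A rate is None when no SNP qualifies for its denominator.
    """
    if len(a) != len(b):
        raise ValueError("both samples must cover the same SNP list")

    shared = mismatched = 0          # SNPs called in both samples
    hom_shared = hom_mismatched = 0  # SNPs homozygous in both samples

    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip SNPs missing in either sample
        shared += 1
        differs = ga != gb
        if differs:
            mismatched += 1
        if ga != 1 and gb != 1:  # both calls are homozygous (0 or 2)
            hom_shared += 1
            if differs:
                hom_mismatched += 1

    agmr = mismatched / shared if shared else None
    hgmr = hom_mismatched / hom_shared if hom_shared else None
    return agmr, hgmr


if __name__ == "__main__":
    sample_a = [0, 1, 2, 2, None, 0]
    sample_b = [0, 2, 0, 2, 1, 0]
    print(mismatch_rates(sample_a, sample_b))
```

Under these assumptions, a pair of duplicate samples would show both rates near zero apart from genotyping error. Any thresholds that separate relationship types would have to come from the GRAF paper itself rather than from this sketch.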
The database of Genotypes and Phenotypes (dbGaP) archives and distributes the results of studies that have investigated the interaction of genotype and phenotype.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mean HGMR and AGMR values and correlation coefficients between HGMR and AGMR of all related subjects reported in the data files submitted to dbGaP.
Human genetics data from a large (78,000) and ethnically diverse population, available for secondary analysis to qualified researchers through the database of Genotypes and Phenotypes (dbGaP). It offers the opportunity to identify potential genetic risks and influences on a broad range of health conditions, particularly those related to aging. The GERA cohort is part of the Research Program on Genes, Environment, and Health (RPGEH), which includes more than 430,000 adult members of the Kaiser Permanente Northern California system. Data from this larger cohort include electronic medical records, behavioral and demographic information from surveys, and saliva samples from 200,000 participants obtained with informed consent for genomic and other analyses. The RPGEH database was made possible largely through early support from the Robert Wood Johnson Foundation to accelerate such health research. The genetic information in the GERA cohort translates into more than 55 billion bits of genetic data. Using newly developed techniques, the researchers conducted genome-wide scans to rapidly identify single nucleotide polymorphisms (SNPs) in the genomes of the people in the GERA cohort. These data will form the basis of genome-wide association studies (GWAS) that can look at hundreds of thousands to millions of SNPs at the same time. The RPGEH then combined the genetic data with information derived from Kaiser Permanente's comprehensive longitudinal electronic medical records, as well as extensive survey data on participants' health habits and backgrounds, providing researchers with an unparalleled research resource. As information is added to the Kaiser-UCSF database, the dbGaP database will also be updated.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Predicted HGMR values and standard deviations for different types of relationships assuming allele frequencies are evenly distributed between 0.1 and 0.9.
Subtypes are provided for each of the 555 Stage 1 GWAS samples included in the clustering analysis according to their dbGaP SUBJIDs. dbGaP, database of Genotypes and Phenotypes; GWAS, genome-wide association study; SUBJID, subject ID. (TXT)
The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) is a national genetics data repository facilitating access to genotypic and phenotypic data for Alzheimer's disease (AD). Data include GWAS, whole genome (WGS) and whole exome (WES), expression, RNA-Seq, and ChIP-Seq analyses. Data for the Alzheimer’s Disease Sequencing Project (ADSP) are available through a partnership with dbGaP (ADSP at dbGaP). Results are integrated and annotated in the searchable genomics database, which also provides access to a variety of software packages, analytic pipelines, online resources, and web-based tools to facilitate analysis and interpretation of large-scale genomic data. Data are available as defined by the NIA Genomics of Alzheimer’s Disease Sharing Policy and the NIH Genomics Data Sharing Policy. Investigators return secondary analysis data to the database in keeping with the NIAGADS Data Distribution Agreement.
The Framingham Heart Study (FHS) is a prospective cohort study of 3 generations of subjects who have been followed for up to 65 years to evaluate risk factors for cardiovascular disease. Its large sample of ~15,000 men and women who have been extensively phenotyped with repeated examinations makes it ideal for the study of genetic associations with cardiovascular disease risk factors and outcomes. DNA samples have been collected and immortalized since the mid-1990s and are available on ~8000 study participants in 1037 families. These samples have been used for collection of GWAS array data and exome chip data in nearly all participants with DNA samples, and for targeted sequencing, deep exome sequencing and light coverage whole genome sequencing in limited numbers. Additionally, mRNA and miRNA expression data, DNA methylation data, metabolomics and other 'omics data are available on a sizable portion of study participants. This project will focus on deep whole genome sequencing (mean 30X coverage) in ~4100 participants, with imputation to all participants with GWAS array data, to more fully understand the genetic contributions to cardiovascular, lung, blood and sleep disorders. Also available are aptamer proteomic profiling, RNAseq and 850K array DNA methylation data that predominantly overlap with participants with WGS data. Comprehensive phenotypic and pedigree data for study participants are available through dbGaP phs000007.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Probabilities of different IBD and IBS states for different relationships.
This study includes samples from two projects: Collaborative Genetic Study of Nicotine Dependence (COGEND; PI: Laura Bierut) and Genetic Study of Nicotine Dependence in African Americans (AAND; PIs: Laura Bierut and Eric Johnson). The majority of the COGEND subjects included in the current study overlap with the two datasets already available on dbGaP. GWAS data are available for COGEND subjects through the Study of Addiction: Genetics and Environment (SAGE), dbGaP study accession phs000092. It should be noted that the case definition in the SAGE study is DSM-IV alcohol dependence. GWAS data are available for additional COGEND subjects through The Genetic Architecture of Smoking and Smoking Cessation, dbGaP study accession phs000404. The overall goal of this project is to apply deep sequencing to key genomic regions associated with nicotine dependence in order to accelerate the discovery of variation in molecular pathways that govern the development of... (for more see dbGaP study page.)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Kinship = kinship coefficient estimated by KING.
The NIMH Repository and Genomics Resource (RGR) stores biosamples and genetic, pedigree and clinical data collected in designated NIMH-funded human subject studies. The RGR database also links to other repositories holding data from the same subjects, including dbGaP, GEO and NDAR. The NIMH RGR allows the broader research community to access these data and biospecimens (e.g., lymphoblastoid cell lines, induced pluripotent cell lines, fibroblasts) and further expand the genetic and molecular characterization of patient populations with severe mental illness.
National genetics data repository facilitating access to genotypic and phenotypic data for Alzheimer's disease (AD). Data include GWAS, whole genome (WGS) and whole exome (WES), expression, RNA-Seq, and ChIP-Seq analyses. Data for the Alzheimer’s Disease Sequencing Project (ADSP) are available through a partnership with dbGaP (ADSP at dbGaP). Repository for many types of data generated from NIA-supported grants and/or NIA-funded biological samples. Data are deposited at NIAGADS or NIA-approved sites. Genetic data and associated phenotypic data are available to qualified investigators in the scientific community for secondary analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Running times and prediction accuracies of the sub-quadratic algorithm tested with datasets of different sample sizes and genotype missing rates.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of the performances of the GRAF quadratic algorithm and KING 2.0 on finding identical pairs with different numbers of SNPs with genotypes.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Cancer Moonshot Biobank is a National Cancer Institute initiative to support current and future investigations into drug resistance and sensitivity and other NCI-sponsored cancer research initiatives, with the aim of improving researchers' understanding of cancer and how to intervene in cancer initiation and progression. During the course of this study, biospecimens (blood and tissue removed during medical procedures) and associated data will be collected longitudinally from at least 1000 patients across at least 10 cancer types, who represent the demographic diversity of the U.S. and are receiving standard-of-care cancer treatment at multiple NCI Community Oncology Research Program (NCORP) sites.
This collection contains de-identified radiology and histopathology imaging procured from subjects in NCI’s Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer (CMB-AML) cohort. Associated genomic, phenotypic and clinical data will be hosted by The Database of Genotypes and Phenotypes (dbGaP) and other NCI databases. A summary of Cancer Moonshot Biobank imaging efforts can be found on the Cancer Moonshot Biobank Imaging page.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of the performances of the GRAF quadratic algorithm and KING 2.0 on finding identical pairs with different AGMR ranges.