Catalog of published genome-wide association studies. A GWAS examines a genome-wide set of genetic variants in different individuals to see whether any variant is associated with a trait or disease. Database of genome-wide association study (GWAS) publications including only those attempting to assay single nucleotide polymorphisms (SNPs). Publications are organized from most to least recent date of publication. Studies are identified through weekly PubMed literature searches, daily NIH-distributed compilations of news and media reports, and occasional comparisons with an existing database of GWAS literature (HuGE Navigator). Works with HANCESTRO ancestry representation.
GWAS Central (previously the Human Genome Variation database of Genotype-to-Phenotype information) is a database of summary-level findings from genetic association studies, both large and small. It gathers datasets from public domain projects and accepts direct data submission. It is based upon Marker information encompassing SNP and variant information from public databases, to which allele and genotype frequency data and genetic association findings are added. A Study (most generic level) contains one or more Experiments, one or more Sample Panels of test subjects, and one or more Phenotypes. This collection references a GWAS Central Study.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rows are as follows: (1) “Totals”: number of samples of a given ancestry in analyzed papers, with redundancy between studies published multiple times; (2) “Rate in GWAS”: percentage of total samples considered that were of this ancestry; (3) “Rate in Population”: percentage of world’s population that is of this ancestry; (4) “Enrichment in GWAS”: relative over (or under) representation of ancestry in GWAS relative to its rate in the world. Ancestry labels are approximations with the standard correspondences to HapMap2 reference samples (European = CEU, East Asian = JPT+CHB, African = YRI); here, “African American” denotes samples reported with that nomenclature, which typically corresponds to 80:20 admixture between ancestral sub-Saharan African and Western European genetics [11]. All of these equivalences are oversimplifications but correspond to assumptions widely used in the field. Counts are computed from totals across all papers analyzed in this study, not adjusting for duplicate uses of the same datasets across multiple studies. Total sample sizes are maximum counts of samples assuming no per-genotype missingness is present. The totals are rounded to the nearest integer as several imputed studies reported nonintegral sample sizes. Row 3 percentages in world population are approximations based on demographic data from 2014–2015 [12, 13].
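The relationship between the rows described above can be sketched in a few lines of Python. This is a minimal illustration that assumes "Enrichment in GWAS" is the ratio of the rate in GWAS to the rate in the world population; the function name `enrichment`, the ancestry labels A/B/C, and all numbers are hypothetical placeholders, not values from the table.

```python
def enrichment(gwas_counts, population_share):
    """Return rate-in-GWAS and enrichment per ancestry label.

    gwas_counts: sample counts per ancestry (row 1, "Totals").
    population_share: fraction of the world population per ancestry (row 3).
    Assumption: enrichment = rate in GWAS / rate in population.
    """
    total = sum(gwas_counts.values())
    result = {}
    for label, count in gwas_counts.items():
        rate = count / total  # row 2, as a fraction rather than a percentage
        result[label] = {
            "rate_in_gwas": rate,
            "enrichment": rate / population_share[label],  # row 4
        }
    return result


# Placeholder inputs for illustration only; NOT the values from the table.
counts = {"A": 800, "B": 150, "C": 50}
share = {"A": 0.2, "B": 0.3, "C": 0.5}
table = enrichment(counts, share)
```

Under this reading, an enrichment above 1 indicates over-representation of an ancestry in GWAS samples and a value below 1 indicates under-representation.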
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome-wide association study results for the metabolism of phosphatidylcholine (PC) (16:0/16:0) and dulcitol.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GWAS SVatalog is a novel visualization tool and database for structural variants (SVs) found in a predominantly European population of 101 individuals with Cystic Fibrosis (CF). Aside from the CF-causing variants on chromosome 7 and the LD block in which they lie, the remainder of the genome is comparable to the 1000 Genomes healthy European population. This data is a collection of SV calls and their linkage disequilibrium (LD) statistics with GWAS-significant SNPs reported in the GWAS Catalog.
The goal of this project is to provide a resource to aid fine mapping of GWAS loci using SVs. GWAS loci are generally identified by SNPs, which account for an incomplete proportion of genetic variation and phenotypic heritability. These SNPs may have limited direct relevance to the phenotype and instead tag other polymorphisms, such as SVs, that could be the cause of the association signal. To leverage this data to its full potential, visit the GWAS SVatalog web tool. There, interactive visualizations illustrate SVs identified in high LD with GWAS-significant SNPs, suggesting putative causal variation that could guide additional functional investigation.
For more information on how to use GWAS SVatalog, visit the documentation.
This project was accomplished in collaboration with the Strug Lab at The Hospital for Sick Children (SickKids), The Centre for Applied Genomics (TCAG), and the University of Toronto.
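To make the idea of an LD statistic between an SV and a GWAS SNP concrete, here is a minimal sketch. It assumes both variants are coded as 0/1/2 allele counts per individual and uses the squared Pearson correlation of those counts as r²; this is a common approximation and is not necessarily how GWAS SVatalog computes its statistics. The function name `genotype_r2` is hypothetical.

```python
def genotype_r2(sv, snp):
    """Squared Pearson correlation between two genotype vectors (0/1/2).

    Individuals with a missing call (None) at either variant are skipped.
    Returns None when r2 is undefined (too few samples or no variation).
    """
    pairs = [(a, b) for a, b in zip(sv, snp) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None  # a monomorphic variant carries no LD information
    return (cov * cov) / (var_a * var_b)


# Toy genotypes, invented for illustration.
perfect = genotype_r2([0, 1, 2, 1, 0], [0, 1, 2, 1, 0])
unlinked = genotype_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near 1 means the SNP tags the SV almost perfectly, which is the situation in which an SV becomes a plausible candidate for the association signal.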
Combines collections of genetic variants (GVs) from GWAS with comprehensive functional annotations and disease classifications. It aims to maximize the utility of GWAS data for gaining biological insights through an integrative, multi-dimensional functional annotation portal. In addition to all GVs annotated in the NHGRI GWAS Catalog, we manually curate GVs that are marginally significant (P value < 10^-3) by looking into the supplementary materials of each original publication, and provide extensive functional annotations for these GVs. GVs are manually classified by disease according to Disease Ontology Lite and HPO (Human Phenotype Ontology) for easy access. The database can also conduct gene-based pathway enrichment and PPI network association analysis for diseases with sufficient variants. SOAP services are available. You may download the GWASdb SNP file. (This file contains all of the significant SNPs in GWASdb. In the pvalue column, 0 means that the P-value is not reported in the study but the SNP is significant. In the source column, GWAS:A represents the original data in the GWAS Catalog, while GWAS:B is our curated data with P-value < 10^-3.)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome-wide association studies (GWAS) have identified hundreds of SNPs responsible for variation in human quantitative traits. However, genome-wide-significant associations often fail to replicate across independent cohorts, in apparent inconsistency with their strong effects in discovery cohorts. This limited success of replication raises pervasive questions about the utility of the GWAS field. We identify all 332 studies of quantitative traits from the NHGRI-EBI GWAS Database with attempted replication. We find that the majority of studies provide insufficient data to evaluate replication rates. The remaining papers replicate significantly worse than expected (p < 10^−14), even after adjusting for the regression to the mean of effect size between discovery and replication cohorts, termed the Winner’s Curse (p < 10^−16). We show this is due in part to misreporting replication cohort size as a maximum number, rather than a per-locus one. In 39 studies accurately reporting per-locus cohort size for attempted replication of 707 loci in samples with similar ancestry, the replication rate matched expectation (predicted 458, observed 457, p = 0.94). In contrast, ancestry differences between replication and discovery (13 studies, 385 loci) cause the most highly powered decile of loci to replicate worse than expected, due to differences in linkage disequilibrium.
https://creativecommons.org/publicdomain/zero/1.0/
The data includes genotypes of 482,906 markers for 1,000 individuals. They come from a simulation based on the Illumina 650K human array, typically used for SNP genotyping.
In theory, such data are easy to create: they are just columns with values of 0, 1 and 2. What matters is the correlation structure, which has been preserved here and corresponds to the real one.
The data can be used to test methods for finding significant SNPs. You can generate a trait based on the significant variables of your choice, and then try to find them using the chosen technique (which is not easy, due to the huge number of variables).
The y.txt file contains the trait I simulated based on the following list of 24 SNPs:
- ch01_19810
- ch01_27796
- ch01_32763
- ch02_22034
- ch02_39189
- ch03_2703
- ch03_10846
- ch04_05127
- ch05_7371
- ch06_25838
- ch08_15190
- ch10_444
- ch10_8265
- ch11_12611
- ch11_20057
- ch12_3421
- ch14_6999
- ch15_3859
- ch16_4525
- ch17_4306
- ch18_1031
- ch19_1377
- ch19_6378
- ch22_33
See which ones you can find!
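The workflow suggested above (generate a trait from chosen SNPs, then try to recover them) can be sketched on a toy scale. The example below is only an illustration: it draws small random genotypes in memory rather than reading the dataset's files, so it has none of the real correlation structure, and the function names, effect size, and noise level are all hypothetical. The search technique shown is a simple marginal scan that ranks each marker by its squared correlation with the trait.

```python
import random


def simulate_trait(genotypes, causal, effect=2.0, noise_sd=1.0, rng=None):
    """Additive trait: effect * allele count at each causal marker plus Gaussian noise."""
    if rng is None:
        rng = random.Random(0)
    return [
        sum(effect * row[j] for j in causal) + rng.gauss(0.0, noise_sd)
        for row in genotypes
    ]


def squared_correlation(x, y):
    """Squared Pearson correlation; 0.0 if either vector has no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def marginal_scan(genotypes, trait):
    """Score every marker on its own and return (score, index) pairs, best first."""
    n_markers = len(genotypes[0])
    scores = []
    for j in range(n_markers):
        column = [row[j] for row in genotypes]
        scores.append((squared_correlation(column, trait), j))
    return sorted(scores, reverse=True)


# Toy data: 500 individuals x 50 independent markers coded 0/1/2.
rng = random.Random(42)
n_individuals, n_markers = 500, 50
genotypes = [
    [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_markers)]
    for _ in range(n_individuals)
]
causal = [3, 17]  # hypothetical choice of causal marker indices
trait = simulate_trait(genotypes, causal, rng=rng)
ranking = marginal_scan(genotypes, trait)
top_hits = sorted(j for _, j in ranking[: len(causal)])
```

With strong effects and independent markers the causal indices come out on top; on the real data, correlated neighbouring markers and the much larger number of variables make the same task considerably harder.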
This deposit provides full details of the genome wide association study (GWAS) pipeline developed by the MRC-IEU for the full UK Biobank (version 3, March 2018) genetic data. For any issues with use of this documentation please contact: mrc-ieu@bristol.ac.uk. This dataset supersedes the earlier version at https://doi.org/10.5523/bris.2fahpksont1zi26xosyamqo8rr
Publicly available database of summary level findings from genetic association studies in humans, including genome wide association studies (GWAS). Previously named HGBASE, HGVbase and HGVbaseG2P.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of GWAS data.
A repository database for continuous and intensive management of GWAS data and variation data identified by next generation sequencing (NGS), and for data sharing among researchers. In this database, variations including short/long insertions/deletions and structural variations related to disease susceptibility, virus resistance, and drug response are registered along with statistical genetic results and simple clinical characteristics to clarify locus-specific characteristics. Currently this database contains information extracted from scientific papers, next generation sequencing results, and other small-scale experimental results from several research laboratories. Mutation data submission is greatly appreciated.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This deposit contains summary statistics of genome-wide analyses for dental endpoints in the GLIDE consortium and UK Biobank, as well as a multi-trait analysis combining both GLIDE and UK Biobank. A full description of the files is included as a readme, and the workflow is described in full in the linked manuscript (under review). Contributing studies, people and funders are acknowledged in the linked manuscript.
The GWAS Catalog provides a consistent, searchable, visualisable and freely available database of published SNP-trait associations, which can be easily integrated with other resources, and is accessed by scientists, clinicians and other users worldwide.
The Epilepsy Genetic Association Database (epiGAD) is an online repository of data relating to genetic association studies in the field of epilepsy. It summarizes the results of both published and unpublished studies, and is intended as a tool for researchers in the field to keep abreast of recent studies, providing a bird's-eye view of this research area. The goal of epiGAD is to collate all association studies in epilepsy in order to help researchers in this area identify all the available gene-disease associations. By including unpublished studies, it also hopes to reduce the problem of publication bias and provide more accurate data for future meta-analyses. It is also hoped that epiGAD will foster collaboration between the different epilepsy genetics groups around the world, and facilitate the formation of a network of investigators in epilepsy genetics.
There are 4 databases within epiGAD:
- the susceptibility genes database
- the epilepsy pharmacogenetics database
- the meta-analysis database
- the genome-wide association studies (GWAS) database
The susceptibility genes database compiles all studies related to putative epilepsy susceptibility genes (e.g. interleukin-1-beta in TLE), while the pharmacogenetics studies in epilepsy (e.g. ABCB1 studies) are stored in 'pharmacogenetics'. The meta-analysis database compiles all existing published epilepsy genetic meta-analyses, whether for susceptibility genes or pharmacogenetics. The GWAS database is currently empty, but will be filled once GWAS are published.
Sponsors: The epiGAD website is supported by the ILAE Genetics Commission.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Levels of sociability are continuously distributed in the general population, and decreased sociability represents an early manifestation of several brain disorders. Here, we investigated the genetic underpinnings of sociability in the population.
Main questions of our research: 1. Are there common genetic variants that are associated with sociability in the general population? 2. Are genetic variants that are associated with sociability also associated with neuropsychiatric disorders?
Type of data uploaded in this repository:
The UK Biobank project (see https://www.ukbiobank.ac.uk/) is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. The database is globally accessible to approved researchers undertaking vital research into the most common and life-threatening diseases. The raw data that this project is based on comes from the publicly available UK Biobank set, which is very large and is therefore not provided here. Here we only provide the results from our analysis, which is also described at https://www.biorxiv.org/content/10.1101/781195v2 and is currently in revision at a scientific journal. In the dataset you will find the association of 9327396 genetic variants with the phenotype sociability. This dataset is not suitable for opening in Excel, and can best be opened on a cluster computer or using specific software.
Subjects
The UK Biobank (UKBB) is a major population-based cohort from the United Kingdom that includes individuals aged between 37 and 73 years.
We constructed a sociability measure based on the aggregation of scores per participant on four questions from the UKBB database that link to sociability, including (1) a question about the frequency of friend/family visits, (2) a question on the number and type of social venues that are visited, (3) a question about worrying after social embarrassment and (4) a question about feeling lonely, leading to a sociability score ranging from 0-4. Participants were excluded if they had somatic problems that could be related to social withdrawal (BMI < 15 or BMI > 40, narcolepsy (all the time), stroke, severe tinnitus, deafness or brain-related cancers) or if they answered that they had “No friends/family outside household” or “Do not know” or “Prefer not to answer” to any of the questions.
SNP genotyping and quality control
Details about the available genome-wide genotyping data for UKBB participants have been reported previously (PMID: 30305743). We used third-release genotyping data (see https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100319). Briefly, 49,950 participants were genotyped using the UK BiLEVE Axiom Array and 438,427 participants were genotyped using the UK Biobank Axiom Array. Genotypes were imputed into the dataset using the Haplotype Reference Consortium (HRC) and the UK10K haplotype resource. To account for ethnicity, we included only those individuals that identified themselves as "white" by self-report and plotted the Principal Components (PCs) provided by the UKBB, excluding individuals considered to be outliers according to PCs 1 and 2. Genetic relatedness calculated with KING kinship and provided by the UKBB (https://kenhanscombe.github.io/ukbtools/articles/explore-ukb-data.html ; http://www.ukbiobank.ac.uk/wp-content/uploads/2014/04/UKBiobank_genotyping_QC_documentation-web.pdf) was used to identify first and second-degree relatives. Subsequently 'families' (i.e. clusters of related individuals above an IBD > 0.125 threshold) were created and only one individual from each of these created 'families' was included in the analysis. If self-reported sex and SNP-based sex differed, individuals were excluded from further analysis. Single nucleotide polymorphisms (SNPs) with minor allele frequency < 0.005, Hardy-Weinberg equilibrium test P value < 1e−6, missing genotype rate > 0.05, or imputation quality of INFO < 0.8 were excluded. In the current study, all analyses are based on 342,461 participants of European ancestry for whom both genotype data and sociability scores were available.
Genome-wide association analysis
Genome-wide association analysis with the imputed marker dosages was performed in PLINK1.9, using a linear regression model with the sociability measure as the dependent variable and including sex, age, the first 10 PCs, assessment center, and genotype batch as covariates. SNPs were considered significantly associated if they had p-value < 5e-8. Associated loci were considered independent of each other at r2 0.6 and lead SNPs were classified as the SNP with the smallest association p-value and at r2 0.1, using a 250kb window.
The summary statistics come from the plink2 linear regression analysis.
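The variant-level quality-control thresholds and the genome-wide significance cutoff described above can be expressed as a small filter. This is a minimal sketch, not the authors' pipeline (which used PLINK): the record layout and the field names `maf`, `hwe_p`, `missing_rate`, `info` and `p`, as well as the example variants, are hypothetical. Only the numeric thresholds are taken from the description.

```python
# Thresholds taken from the text above.
MAF_MIN = 0.005       # exclude minor allele frequency < 0.005
HWE_P_MIN = 1e-6      # exclude Hardy-Weinberg test P value < 1e-6
MISSING_MAX = 0.05    # exclude missing genotype rate > 0.05
INFO_MIN = 0.8        # exclude imputation INFO < 0.8
GENOME_WIDE_P = 5e-8  # significance threshold for association


def passes_qc(variant):
    """True if the variant survives all four exclusion criteria."""
    return (
        variant["maf"] >= MAF_MIN
        and variant["hwe_p"] >= HWE_P_MIN
        and variant["missing_rate"] <= MISSING_MAX
        and variant["info"] >= INFO_MIN
    )


def significant_hits(variants):
    """IDs of QC-passing variants with association p-value below 5e-8."""
    return [v["id"] for v in variants if passes_qc(v) and v["p"] < GENOME_WIDE_P]


# Invented example records (field names are assumptions, values are made up).
variants = [
    {"id": "snp_a", "maf": 0.20, "hwe_p": 0.50, "missing_rate": 0.01, "info": 0.95, "p": 1e-9},
    {"id": "snp_b", "maf": 0.001, "hwe_p": 0.50, "missing_rate": 0.01, "info": 0.95, "p": 1e-12},
    {"id": "snp_c", "maf": 0.30, "hwe_p": 0.40, "missing_rate": 0.00, "info": 0.99, "p": 1e-3},
    {"id": "snp_d", "maf": 0.10, "hwe_p": 0.20, "missing_rate": 0.02, "info": 0.50, "p": 1e-10},
]
passed = [v["id"] for v in variants if passes_qc(v)]
hits = significant_hits(variants)
```

Here `snp_b` is dropped for low allele frequency and `snp_d` for low imputation quality, so only `snp_a` is both QC-passing and genome-wide significant.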
GWAS data sets with individual level data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The files used in the real data analyses of the paper entitled "Eagle: Making multiple-locus association mapping on a genome-wide scale routine". The original data come from Nicod et al. (2016) Nature Genetics, who collected SNP dosages (continuous scores from 0 to 1). We transformed these dosages into SNP genotypes (0, 1, 2).
Input files suitable for analysis by Eagle. The data were obtained from a large genome-wide association study (GWAS) performed in outbred mice and published on the Heterogeneous Stock Mouse web site. The data were reorganized to make them suitable for input into Eagle.
Lineage: The original data was obtained from the Heterogeneous Stock Mouse web site at http://wp.cs.ucl.ac.uk/outbredmice/heterogeneous-stock-mice/. The data required further processing by R scripts to convert the SNP dosages into SNP genotypes.
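The description does not state the rule used to turn dosages into genotypes, and the original conversion was done with R scripts that are not shown here. Purely as an illustration of what such a transformation might look like, the Python sketch below bins a dosage on the 0 to 1 scale into three genotype classes. The function name and the cutoffs 0.25 and 0.75 are assumptions chosen for the example, not the values used by the authors.

```python
def dosage_to_genotype(dosage, low=0.25, high=0.75):
    """Map a continuous dosage in [0, 1] to a genotype code 0, 1 or 2.

    Assumed rule (hypothetical cutoffs): dosage < low -> 0,
    low <= dosage < high -> 1, dosage >= high -> 2.
    """
    if not 0.0 <= dosage <= 1.0:
        raise ValueError(f"dosage out of range: {dosage}")
    if dosage < low:
        return 0
    if dosage < high:
        return 1
    return 2


# Invented dosages for one marker across five animals.
dosages = [0.0, 0.1, 0.5, 0.8, 1.0]
genotypes = [dosage_to_genotype(d) for d in dosages]
```

Different cutoffs change how uncertain dosages near the class boundaries are called, so they would need to be matched to the original scripts before reproducing the published input files.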
Information about the 4 GWAS data sets used in this study.