Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The full name of each phenotype is given in the Materials and Methods section. GCTA software was used to estimate the phenotypic variance and its standard error for each phenotype.
Objective: To discover common genetic variants associated with post-stroke outcomes using a genome-wide association (GWA) study.
Methods: The study comprised 6,165 patients with ischemic stroke from 12 studies in Europe, USA and Australia included in the Genetics of Ischaemic Stroke Functional Outcome (GISCOME) network. The primary outcome was modified Rankin Scale (mRS) score after 60-190 days, evaluated as two dichotomous variables (0-2 versus 3-6 and 0-1 versus 2-6) and subsequently as an ordinal variable. GWA analyses were performed in each study independently and results were meta-analyzed. Analyses were adjusted for age, sex, stroke severity (baseline NIH Stroke Scale score), and ancestry. The significance level was P < 5×10⁻⁸.
Results: We identified one genetic variant associated with functional outcome with genome-wide significance (mRS 0-2 vs 3-6, P = 6.8×10⁻⁹). This intronic variant (rs1842681) in the LOC105372028 gene is a previously reported trans-eQTL for PPP1R21, which e...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r2
https://spdx.org/licenses/CC0-1.0.html
An East Asian-specific variant on aldehyde dehydrogenase 2 (ALDH2 rs671, G>A) is the major genetic determinant of alcohol consumption. We performed an rs671 genotype-stratified genome-wide association study meta-analysis of alcohol consumption in 175,672 Japanese individuals to explore gene-gene interactions with rs671 behind drinking behavior. The analysis identified three genome-wide significant loci (GCKR, KLB, and ADH1B) in wild-type homozygotes and six (GCKR, ADH1B, ALDH1B1, ALDH1A1, ALDH2, and GOT2) in heterozygotes, with five showing genome-wide significant interaction with rs671. Genetic correlation analyses revealed ancestry-specific genetic architecture in heterozygotes. Of the discovered loci, four (GCKR, ADH1B, ALDH1A1, and ALDH2) were suggested to interact with rs671 in the risk of esophageal cancer, a representative alcohol-related disease. Our results identify the genotype-specific genetic architecture of alcohol consumption and reveal its potential impact on alcohol-related disease risk.
Methods
We performed an rs671 genotype-stratified GWAS meta-analysis of alcohol consumption within six Japanese cohorts. To validate the stratified approach, we applied a joint meta-analysis (JMA). Further, to validate the impact of the discovered loci on alcohol-related disease, we performed a meta-analysis of two esophageal cancer case-control studies.
GWAS meta-analysis
Study subjects and genotyping: We performed a genome-wide meta-analysis based on the Japanese Consortium of Genetic Epidemiology studies (J-CGE) (1), the Nagahama Study (2), and the BBJ Study (3,4). The J-CGE consisted of the following Japanese population-based and hospital-based studies: the HERPACC Study (5), the J-MICC Study (6,7), the JPHC Study (8), and the TMM Study (9). Individual study descriptions and an overview of the characteristics of the study populations are provided in the Supplementary Information and table S1.
Data and sample collection for the participating cohorts were approved by the respective research ethics committees. All participating studies obtained informed consent from all participants by following the protocols approved by their institutional ethical committees.
Phenotype: Information on alcohol consumption was collected by questionnaire in each study. Because the questionnaires were not homogeneous across the studies, we harmonized the two alcohol consumption phenotypes of drinking status (never versus ever drinker) and daily alcohol intake (g/day) in accordance with each study’s criterion. Details are provided in the Supplementary Information.
Quality control and genotype imputation: Quality control for samples and SNPs was performed based on study-specific criteria (table S2). Genotype data in each study were imputed separately based on the 1000 Genomes Project reference panel (Phase 3, all ethnicities) (10). Phasing was performed with the use of SHAPEIT (v2) (11) and Eagle (12), and imputation was performed using minimac3 (13), minimac4, or IMPUTE (v2) (14). Information on the study-specific genotyping, imputation, quality control, and analysis tools is provided in table S2. After genotype imputation, further quality control was applied to each study. SNPs with an imputation quality of r² < 0.3 for minimac3 or minimac4, info < 0.4 for IMPUTE2, or an MAF of < 0.01 were excluded.
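The post-imputation exclusion rule above can be illustrated with a minimal Python sketch. The function name `passes_qc` and its arguments are hypothetical and are not taken from any of the cited tools; only the three thresholds (r² < 0.3, info < 0.4, MAF < 0.01) come from the text.

```python
def passes_qc(software, quality, maf):
    """Return True if an imputed SNP survives the post-imputation filter.

    software: "minimac3", "minimac4" or "IMPUTE2" (hypothetical labels)
    quality:  imputation r2 (minimac) or info score (IMPUTE2)
    maf:      minor allele frequency
    """
    if maf < 0.01:                      # rare variants are excluded
        return False
    if software in ("minimac3", "minimac4"):
        return quality >= 0.3           # r2 < 0.3 is excluded
    if software == "IMPUTE2":
        return quality >= 0.4           # info < 0.4 is excluded
    raise ValueError(f"unknown imputation software: {software}")


if __name__ == "__main__":
    print(passes_qc("minimac4", 0.85, 0.12))
    print(passes_qc("IMPUTE2", 0.35, 0.12))
```

Keeping the software-specific threshold inside one function makes it harder to apply the minimac cutoff to IMPUTE2 output by mistake.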
Association analysis of SNPs with daily alcohol intake and drinking status: Association analysis of SNPs with daily alcohol intake and drinking status was performed on three different subject groups: the entire population, subjects with the rs671 GG genotype only, and subjects with the rs671 GA genotype only. Because the number of ever drinkers with the rs671 AA genotype was too small (table S3), association analysis in subjects with the rs671 AA genotype only was not conducted. Daily alcohol intake was base-2 log-transformed (log2(grams/day + 1)). The association of daily alcohol intake with SNP allele dose for each study was assessed by linear regression analysis with adjustment for age, age², sex, and the first 10 principal components. For the BBJ Study, the affection status of 47 diseases was further added as covariates. The association of drinking status with SNP allele dose for each study was assessed by logistic regression analysis with adjustment for age, age², sex, the first 10 principal components, and disease affection status of 47 diseases (for the BBJ Study). The effect sizes and standard errors estimated in the association analysis were used in the subsequent meta-analysis. The association analysis was conducted using EPACTS (http://genome.sph.umich.edu/wiki/EPACTS), SNPTEST (15), or PLINK2 (16). Association analysis, including interaction terms, was performed to evaluate the differential effects of each SNP on daily alcohol intake and drinking status between the GG and GA genotypes of rs671. Carriers of the AA genotype were excluded from the analysis. The effect sizes of the interaction term, β_interaction, and their standard errors estimated in the association analysis were used in the subsequent meta-analysis. The association analysis, including the interaction term, was conducted using PLINK2 (16). To identify studies with inflated GWAS significance, which can result from population stratification, we computed the intercept from LDSC (17).
Before the meta-analysis, all study-specific results in the association analysis were corrected by multiplying the standard error of the effect size by the value of intercept from LDSC if the intercept of that study was greater than 1.
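The correction described above can be sketched in a few lines of Python. This follows the rule exactly as worded in the text (multiply the standard error by the intercept when the intercept exceeds 1); the function name `correct_se` is hypothetical, and the intercept itself is assumed to have been estimated beforehand with LDSC.

```python
def correct_se(se, ldsc_intercept):
    """Inflate a study-specific standard error by the LDSC intercept.

    The standard error is left unchanged when the intercept is <= 1,
    so studies without evidence of inflation are not penalised.
    """
    if ldsc_intercept > 1:
        return se * ldsc_intercept
    return se


if __name__ == "__main__":
    # Hypothetical study with intercept 1.05: the SE grows by 5 percent.
    print(correct_se(0.020, 1.05))
    # Hypothetical study with intercept 0.98: the SE is unchanged.
    print(correct_se(0.020, 0.98))
```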
Meta-analysis: The meta-analysis was performed with all Japanese subjects in the six cohorts (table S1). The results of association analyses for each SNP across the studies were combined with METAL software (18) by the fixed-effects inverse-variance-weighted method. Heterogeneity of effect sizes was assessed by I² and Cochran’s Q statistic. The meta-analysis included SNPs for which genotype data were available from at least three studies with a total sample size of at least 20,000 individuals for unstratified GWAS or interaction GWAS or 10,000 individuals for rs671-stratified GWAS. The genome-wide significance level α was set to a P value < 5 × 10⁻⁸. P values smaller than 1.0 × 10⁻³⁰⁰ were calculated with the R package Rmpfr. To assess the inflation of the test statistics for the meta-analysis, we computed the genomic inflation factor, λ, and the intercept from LDSC (19).
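The fixed-effects inverse-variance-weighted combination, Cochran's Q and I² can be written out as a minimal Python sketch using the standard textbook formulas. It is an illustration only and not METAL's implementation; the function name `ivw_meta` and the example numbers are hypothetical, and the two-sided P value uses a normal approximation.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effects inverse-variance-weighted meta-analysis of one SNP."""
    weights = [1.0 / se ** 2 for se in ses]           # w_i = 1 / SE_i^2
    wsum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / wsum
    se = math.sqrt(1.0 / wsum)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))            # two-sided normal P value
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: percentage of variation attributable to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"beta": beta, "se": se, "p": p, "q": q, "i2": i2}


if __name__ == "__main__":
    # Three hypothetical study-level estimates for a single SNP
    result = ivw_meta([0.10, 0.14, 0.08], [0.03, 0.05, 0.04])
    print(result)
```

Studies with smaller standard errors receive larger weights, which is why the pooled estimate sits closest to the most precise study.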
JMA
We used the JMA approach (20,21). The JMA jointly tests both SNP main effects β_SNP and SNP × rs671 interaction effects β_interaction for spherical equivalent with a fixed-effects model, using β_SNP and β_interaction and the covariance matrix of the β estimates from each study. To perform the JMA, the same model as the interaction analysis for each study described above was analyzed using GEM v1.4 (22), which is capable of obtaining robust covariance matrices for β_SNP and β_interaction. To control false positives, only SNPs with MAF ≥ 0.05 were analyzed by the GEM for each study. The JMA was conducted with the fixed effects method using METAL software (version 2010-02-08) (18) and patch source code provided by Manning et al. (20). A Wald’s statistic, following a χ²-distribution with two degrees of freedom (d.f.), was used to test the joint significance of the β_SNP and β_interaction. A Cochran’s Q-test was used to assess the heterogeneity of the β-coefficients across studies for the β_SNP and β_interaction. The cor value was calculated by cor = IntCov / (StdErr × IntStdErr). IntCov is the covariance between β_SNP and β_interaction estimated by the JMA. StdErr and IntStdErr are standard errors of β_SNP and β_interaction estimated by the JMA, respectively. The JMA included SNPs for which genotype data were available from at least three studies with a total sample size of at least 20,000 individuals for interaction GWAS. To control false positives, SNPs with evidence of between-study heterogeneity (HetP < 0.001) and cor < 0.7 were excluded (fig. S7). Genomic control correction was applied by calculating λ as the ratio of the observed and expected (2 d.f.) median χ² statistics and dividing the observed χ² statistics by λ. The genome-wide significance level α for the JMA test was set to a P value < 5 × 10⁻⁸.
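The quantities named in the JMA description can be made concrete with a short Python sketch: the 2-d.f. Wald statistic for the pair of main and interaction effects, the cor value, and genomic control with the 2-d.f. median. This is an illustration under stated assumptions, not the patched METAL code; all function names and inputs are hypothetical. It relies on two standard facts: the survival function of a χ² distribution with 2 d.f. is exp(-x/2), and its median is 2 ln 2.

```python
import math
import statistics

CHI2_2DF_MEDIAN = 2.0 * math.log(2.0)   # expected median of chi-square, 2 d.f.


def joint_wald(beta_snp, beta_int, se_snp, se_int, cov):
    """2-d.f. Wald statistic b' V^-1 b and its P value."""
    det = se_snp ** 2 * se_int ** 2 - cov ** 2        # determinant of V
    stat = (beta_snp ** 2 * se_int ** 2
            - 2.0 * beta_snp * beta_int * cov
            + beta_int ** 2 * se_snp ** 2) / det
    p = math.exp(-stat / 2.0)                         # chi-square(2) survival
    return stat, p


def cor_value(int_cov, std_err, int_std_err):
    """cor = IntCov / (StdErr x IntStdErr)."""
    return int_cov / (std_err * int_std_err)


def genomic_control_2df(chi2_stats):
    """Return lambda and the statistics divided by lambda."""
    lam = statistics.median(chi2_stats) / CHI2_2DF_MEDIAN
    return lam, [s / lam for s in chi2_stats]


if __name__ == "__main__":
    stat, p = joint_wald(0.2, 0.0, 0.1, 0.1, 0.0)
    print(stat, p)
    print(cor_value(0.005, 0.1, 0.1))
```

Because the covariance enters both the determinant and the cross term, ignoring it would misstate the joint statistic whenever the two estimates are correlated.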
Esophageal cancer case-control study
Study sample: In the HERPACC Study, we included 692 cases and 995 age- and sex-matched controls who were selected from participants in the HERPACC-2 (2001–2005) (23) and HERPACC-3 (2005–2013) (24). Cases were first-visit outpatients at Aichi Cancer Center Hospital who were diagnosed with esophageal cancer within -3 to +12 months of the first visit. Controls were first-visit outpatients who were confirmed to have no cancer or history of neoplasm. The BBJ Study included 416 cases and 86,515 controls after excluding (1) outliers from the Japanese cluster, as estimated by principal component analysis with samples of the 1000 Genomes project (10); and (2) closely related individuals estimated by King (25) (specifically, King kinship coefficients > 0.09375). Cases were diagnosed with esophageal cancer within -3 to +12 months from the date of consent. Controls were those confirmed to have no cancer or history of neoplasm. In the HERPACC Study, esophageal cancer cases were identified using the International Classification of Diseases for Oncology, Third Edition (ICD-O-3) (26) topography code C15. As a sensitivity analysis, we performed an additional analysis restricted to cases with squamous cell carcinoma identified using the ICD-O-3 morphology codes of 8050–8078 and 8083–8084, resulting in 636 cases. In the BBJ Study, all participants had been diagnosed with at least one of 47 target diseases, including esophageal cancer, by physicians at the cooperating hospitals. Esophageal cancer histology was determined from excised tissue specimens, and missing histological data were complemented by cytological specimens, resulting in 348 cases of squamous cell
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2: Table S1. Information on the significant SNPs associated with 42 DW in the QTL region on chromosome 4. The data present the genomic location, effect, heritability, P value, and functional annotation of each GWAS significant SNP associated with 42 DW on chromosome 4. Table S2. Information on the significant SNPs associated with 84 DW in the QTL region on chromosome 4. The data present the genomic location, effect, heritability, P value, and functional annotation of each GWAS significant SNP associated with 84 DW on chromosome 4. Table S3. Information on the significant SNPs associated with ADG in the QTL region on chromosome 4. The data present the genomic location, effect, heritability, P value, and functional annotation of each GWAS significant SNP associated with ADG on chromosome 4. Table S4. Information on the significant SNPs associated with FCR in the QTL region on chromosome 4. The data present the genomic location, effect, heritability, P value, and functional annotation of each GWAS significant SNP associated with FCR on chromosome 4. Table S5. Average prediction inflation based on tenfold cross-validation. The data provide the average prediction inflation using ABLUP, GBLUP or GFBLUP models. Table S6. QTL contributions to the theoretical prediction accuracy (%). Description: The contributions were estimated by setting GWAS significant SNPs in each QTL as fixed factors. Table S7. Average prediction inflation addition after setting GWAS significant SNPs as fixed factors. The average prediction inflation changes after setting GWAS significant SNPs as fixed factors, where values approaching 0 represent lower inflations. Table S8. QTL contributions to heritability and observed prediction accuracy (%) using the GFBLUP model. The contributions were evaluated by estimating the contribution of two GRM using the GFBLUP model. Table S9. Average theoretical prediction accuracy using two GRM in the GFBLUP model. 
The data provide the average theoretical accuracy of prediction based on the GRM of GWAS significant sites and the remaining sites, respectively. Table S10. Average prediction inflation addition in the GFBLUP model. The changes in average prediction inflation evaluated by using the GRM without GWAS significant SNPs compared to that using both GRM, where values approaching 0 represent lower inflations.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Table S1. Summary of GWAS data. Table S2. SNP based heritability and genomic inflation factor estimated by LDSC. Table S3. Evaluation of genetic correlation between COPD and CVD related metabolic traits. Table S4. Partitioned genetic correlation between COPD and 3 cardiac traits. Table S5. Local genetic covariance analysis between COPD and RHR (only P
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2: Table S1. Inflation factor (lambda) for each trait. The inflation of observed chi-squared statistic due to population structure was checked. Table S2. Annotation of pleiotropic effects on individual traits for 178 putative causal variants identified from the multi-trait conditional GWAS. The annotation includes positions, functional classification, rs ID where available, and signed significant t-values (|t| > 1.96) for the 178 variants across the 16 traits at two ages. Gene names and distances from the nearest genes are also provided if variants are located in or within 100 kb of the gene. Table S3. Annotation of 1510 sequence variants in strong LD with the 178 most significant variants from multi-trait conditional GWAS. The annotation includes the positions, rs ID where available, functional classification and P values of the multi-trait χ² statistic. The names of the nearest genes are also provided if variants are located in or within 100 kb of the gene. Table S4. List of 453 published genes previously associated with fibre characteristics in mammals (“Published-Gene set”) used to identify sequence variants to be included in BayesR and GBLUP analysis. Ensembl identification, name of genes and gene location (chromosome number, start and end position in bp).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 1: Table S1. Blood traits, cancer types and GWAS data sources. Table S2. Plasma samples of women carriers of pathogenic variants in BRCA1/2, affected or unaffected by breast cancer after blood test (< 12 months) and used for circulating sRNA-seq. Table S3. Plasma samples of sporadic women affected or unaffected by breast cancer after blood test (< 12 months) and used for sRNA-seq. Table S4. Multivariate Cox regression analysis of cancer diagnosis in UKBB (all cancers; >12 months from basal blood test). Table S5. Multivariate Cox regression analysis of cancer diagnosis in UKBB (all cancers; within 12 months from basal blood test). Table S6. Multivariate Cox regression analysis of cancer diagnosis in women of the UKBB (all cancers; >12 months from basal blood test). Table S7. Multivariate Cox regression analysis of cancer diagnosis in men of the UKBB (all cancers; >12 months from basal blood test). Table S8. Patient and incident cases included in the analyses. Table S9. Multivariate Cox regression analysis of breast cancer diagnosis in UKBB (>12 months from basal blood test). Table S10. Multivariate Cox regression analysis of colon cancer diagnosis in UKBB (>12 months from basal blood test). Table S11. Multivariate Cox regression analysis of lung cancer diagnosis in UKBB (>12 months from basal blood test). Table S12. Multivariate Cox regression analysis of prostate cancer diagnosis in UKBB (>12 months from basal blood test). Table S13. Heritability and genetic correlations between blood cell traits and cancer risk. Table S14. Genomic inflation (lambda factor) analysis for the comparisons between cancer risk and blood trait GWAS results. Table S15. Pleiotropy leading SNPs linking blood traits and cancer risk. Table S16. Pan-cancer pleiotropic SNPs (Rashkin et al., 2020) identified in the blood-cancer pleiotropy study (conjFDR < 0.05). Table S17. Pleiotropic gene candidates previously associated with leukocyte telomere length (Codd et al., 2021). 
Table S18. Genomic hotspots (1, 3, or 5 Mb) with significant enrichment in pleiotropic variants and linked to > 2 cancer traits. Table S19. Regulatory marks enriched in the blood-cancer pleiotropic variants (DNase I hypersensitivity (sheffield_dnase), transcription factor binding sites (encode_tfbs), and epigenetic marks (roadmap_epigenomics) data). Table S20. Master regulators of hematopoiesis. Table S21. Pleiotropic gene candidates identified in the hematopoiesis-related gene modules (Velten et al., 2017). Table S22. Pleiotropic variants linked to RNY-containing loci. Table S23. GWAS-catalog cancer risk associations linked to RNY-containing loci (chromosomes 1-22). Table S24. Regulatory marks enriched in the 5' and 3' TSS regions of the pleiotropic RNY relative to non-pleiotropic RNY loci. Table S25. SLE risk variants (GWAS) correlated with blood-cancer pleiotropic variants in RNY-containing loci.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Inflation factors of all single studies were negligible (maximum λ = 1.014 in the Sorbs cohort). No inflation of meta-analysis results was observed (λ = 0.99). Positions of SNPs are given according to HapMap2 CEU, Release 24, dbSNP-build 126, NCBI 36 as the reference panel. Effect directions are given for the coding allele. β: beta coefficient of the regression model. Chr: chromosome; SE: standard error; LRRC61: leucine rich repeat containing 61; ACTR3C: ARP3 actin-related protein 3 homolog C; RARRES2: retinoic acid receptor responder 2; REPIN1: replication initiator 1; ZBED6CL: ZBED6 C-terminal like. SNPs associated with circulating chemerin levels at genome-wide significant levels.
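An inflation factor λ such as the one reported above is commonly computed as the median of the observed 1-d.f. χ² association statistics divided by the expected median under the null (about 0.4549). The Python sketch below shows that common definition; how λ was actually computed for this dataset is not stated in the text, and the function name and example values are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 d.f. (standard constant)
CHI2_1DF_MEDIAN = 0.4549364231


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median / expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical 1-d.f. chi-square statistics from a small set of SNPs
    stats = [0.10, 0.30, 0.46, 0.90, 2.50]
    print(round(genomic_inflation(stats), 3))
```

Values close to 1 indicate that the bulk of the test statistics follow the null distribution, which is the situation described in the caption.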
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using principal component (PC) analysis, we studied the genetic constitution of 3,112 individuals from Europe as portrayed by more than 270,000 single nucleotide polymorphisms (SNPs) genotyped with the Illumina Infinium platform. In cohorts where the sample size was >100, one hundred randomly chosen samples were used for analysis to minimize the sample size effect, resulting in a total of 1,564 samples. This analysis revealed that the genetic structure of the European population correlates closely with geography. The first two PCs highlight the genetic diversity corresponding to the northwest to southeast gradient and position the populations according to their approximate geographic origin. The resulting genetic map forms a triangular structure with a) Finland, b) the Baltic region, Poland and Western Russia, and c) Italy as its vertices, and with d) Central and Western Europe in its centre. Inter- and intra-population genetic differences were quantified by the inflation factor lambda (λ) (ranging from 1.00 to 4.21), fixation index (Fst) (ranging from 0.000 to 0.023), and by the number of markers exhibiting significant allele frequency differences in pair-wise population comparisons. The estimated λ was used to assess the real diminishing impact on association statistics when two distinct populations are merged directly in an analysis. When the PC analysis was confined to the 1,019 Estonian individuals (0.1% of the Estonian population), a fine structure emerged that correlated with the geography of individual counties. With at least two cohorts available from several countries, genetic substructures were investigated in Czech, Finnish, German, Estonian and Italian populations. Together with previously published data, our results allow the creation of a comprehensive European genetic map that will greatly facilitate inter-population genetic studies including genome-wide association studies (GWAS).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Results are for GWAS analysis in GenABEL using each individual BMI reading as a separate observation, modelling the correlation between readings via the estimated kinship and using the Genomic Control deflation factor to avoid inflation of the overall distribution of test statistics. Results are for allele-wise tests under an additive model of inheritance for genotyped SNPs and for imputed data. Bold indicates the top hit for BMI based on both genotyped and 1000G imputed data. Full lists of the top 50 hits for genotyped SNPs, and the top 100 SNPs for imputed data, appear in S3 Table and S6 Table, respectively. * Genes separated by forward slash indicate nearest protein coding genes upstream/downstream of the SNP. NCBI37 = bp location on chromosome for NCBI Build 37. A1 = major allele; A2 = minor allele. Top GWAS SNP hits in genes of functional relevance for BMI, organized by chromosome.