Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains GWAS summary statistics for Standing Height in the UK Biobank.
The GWAS study used data from "White British" samples (N = 337225), which were randomly divided into 5 folds for the purposes of cross-validation. The upload contains, for each fold, GWAS summary statistics for the training and test set. The test summary statistics can be used to evaluate PRS models via pseudo-validation methods. Association testing was done with plink2.
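The random 5-fold division described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names, the seed, and the use of integer stand-ins for sample IDs are all assumptions, and the idea that each fold's training set consists of the other four folds is inferred from the description rather than stated in it.

```python
import random


def assign_folds(sample_ids, n_folds=5, seed=0):
    """Randomly partition sample IDs into n_folds folds of near-equal size."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    rng.shuffle(ids)
    # Deal the shuffled IDs round-robin so fold sizes differ by at most one.
    return [ids[i::n_folds] for i in range(n_folds)]


def split_for_fold(folds, test_index):
    """Hold out one fold as the test set; pool the remaining folds for training."""
    test = folds[test_index]
    train = [s for i, fold in enumerate(folds) if i != test_index for s in fold]
    return train, test


# Integer stand-ins for the N = 337225 samples mentioned above.
folds = assign_folds(range(337225))
train_ids, test_ids = split_for_fold(folds, test_index=0)
```

Running the GWAS once on `train_ids` and once on `test_ids` for each of the five choices of `test_index` would yield the per-fold training and test summary statistics described above.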
The structure of the data is as follows:
For more details about the GWAS study, Quality Control (QC) criteria, or other information, please consult our publication:
Zabad, S., Gravel, S., & Li, Y. (2023). Fast and accurate Bayesian polygenic risk modeling with variational inference. The American Journal of Human Genetics, 110(5), 741–761. https://doi.org/10.1016/j.ajhg.2023.03.009
If you use this data in your work, please cite the publication above.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Quantile regression (QR) GWAS summary statistics from the study "Genome-wide discovery for biomarkers using quantile regression at biobank scale". The preprint is available at https://doi.org/10.1101/2023.06.05.543699.
List of traits
A comma-delimited text file, QRGWAS.Traits_n39.csv, includes the list of 39 quantitative traits from the UK Biobank reported in the QR GWAS analyses above.
Summary statistics
The tab-delimited text files are QR GWAS summary statistics, which are bgzip compressed (.tsv.gz files) and tabix indexed (.tbi files).
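Because bgzip output is a series of standard gzip blocks, the .tsv.gz files can be streamed with Python's built-in gzip module without any extra tooling. The sketch below assumes only that the first line of each file is a header row; the column names in the demo (CHR, POS, P) and the function name read_sumstats are placeholders, not the dataset's actual header. Region queries that use the .tbi index need a tabix-aware tool and are not shown here.

```python
import csv
import gzip
import os
import tempfile


def read_sumstats(path, max_rows=None):
    """Yield rows of a gzip/bgzip-compressed TSV as dicts keyed by the header."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for i, row in enumerate(reader):
            if max_rows is not None and i >= max_rows:
                break
            yield row


# Demo on a tiny stand-in file. The column names are hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.tsv.gz")
with gzip.open(demo_path, "wt", newline="") as out:
    out.write("CHR\tPOS\tP\n1\t12345\t0.5\n1\t67890\t1e-9\n")

rows = list(read_sumstats(demo_path))
```

Values are returned as strings, so numeric columns need an explicit conversion such as `float(row["P"])` before filtering.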
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract
Brain ageing is a highly variable, spatially and temporally heterogeneous process, marked by numerous structural and functional changes. These can cause discrepancies between individuals’ chronological age and the apparent age of their brain, as inferred from neuroimaging data. Machine learning models, and particularly Convolutional Neural Networks (CNNs), have proven adept at capturing patterns relating to ageing-induced changes in the brain. The differences between the predicted and chronological ages, referred to as brain age deltas, have emerged as useful biomarkers for exploring those factors which promote accelerated ageing or resilience, such as pathologies or lifestyle factors. However, previous studies rely only on structural neuroimaging for predictions, overlooking potentially informative functional and microstructural changes. Here we show that multiple contrasts derived from different MRI modalities can predict brain age, each encoding bespoke brain ageing information. By using 3D CNNs and UK Biobank data, we found that 57 contrasts derived from structural, susceptibility-weighted, diffusion, and functional MRI can successfully predict brain age. For each contrast, different patterns of association with non-imaging phenotypes were found, resulting in a total of 191 unique, statistically significant associations. Furthermore, we found that ensembling data from multiple contrasts results in both higher prediction accuracies and stronger correlations with non-imaging measurements. Our results demonstrate that other 3D contrasts and modalities, which have not been considered so far for the task of brain age prediction, encode different information about the ageing brain. We envision our work as being the starting point for future investigations into the causal links underpinning the observed brain age deltas and non-imaging measurement associations. For instance, drug effects can be monitored, given that certain medications were correlated with accelerated brain ageing. Furthermore, continued development of brain age models could facilitate their deployment in clinical trials for recruitment and monitoring, and in hospitals for diagnostic and screening tasks.
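As a small illustration of the brain age delta defined in the abstract (predicted age minus chronological age), the following sketch computes deltas for a few made-up subjects. The function name and the example ages are hypothetical and are not taken from the dataset.

```python
def brain_age_deltas(predicted_ages, chronological_ages):
    """Return predicted minus chronological age for each subject."""
    if len(predicted_ages) != len(chronological_ages):
        raise ValueError("both inputs must have one entry per subject")
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]


# Made-up ages in years; a positive delta means the brain appears older
# than the subject's chronological age.
deltas = brain_age_deltas([62.5, 70.0, 55.0], [60.0, 72.0, 55.0])
```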
Data Description
This dataset contains the full correlation results with all nIDPs in the UK Biobank. These are presented in datasets split by sex into female and male subjects. For easier data manipulation, two smaller datasets have also been made available, containing only those correlations that pass the False Discovery Rate (FDR) threshold.
As experiments were also conducted for ensembles using multiple contrasts, similar datasets are provided for those.
Finally, global datasets are also provided. These are the concatenation of the associations contained in the Male and Female datasets.
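For readers who want to reproduce an FDR-style filter on the full correlation tables, one common choice is the Benjamini-Hochberg step-up procedure. The description above does not say which FDR procedure or threshold was used, so both the procedure and the alpha = 0.05 default below are assumptions, and the p-values in the demo are invented.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it passes BH at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Every p-value ranked at or below max_k is declared significant.
    passed = [False] * m
    for idx in order[:max_k]:
        passed[idx] = True
    return passed


# Invented p-values for four associations.
mask = benjamini_hochberg([0.001, 0.03, 0.035, 0.5])
```

Note the step-up behaviour: the second-smallest p-value fails its own rank threshold but is still kept, because a larger p-value passes at a higher rank.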
Paper & Code
The paper associated with this dataset can be accessed here:
To access the code for this project, please see the project GitHub repositories:
If you use this work, please cite the paper above, or use the following BibTeX entry:
@inproceedings{roibu2023brain,
title={Brain Ages Derived from Different MRI Modalities are Associated with Distinct Biological Phenotypes},
author={Roibu, Andrei-Claudiu and Adaszewski, Stanislaw and Schindler, Torsten and Smith, Stephen M and Namburete, Ana IL and Lange, Frederik J},
booktitle={2023 10th IEEE Swiss Conference on Data Science (SDS)},
pages={17--25},
year={2023},
organization={IEEE},
doi={10.1109/SDS57534.2023.00010}
}
Data Access
The data for this project is freely available upon application at the UK Biobank. For more information regarding the individual nIDPs, please access the UK Biobank Showcase website at: https://biobank.ctsu.ox.ac.uk/showcase/search.cgi
Funding
ACR is supported by EPSRC Grant EP/S024093/1, F. Hoffmann-La Roche AG and a 2021 Industrial Fellowship offered by the Royal Commission for the Exhibition of 1851. SMS is supported by a Wellcome Trust Collaborative Award 215573/Z/19/Z. AILN is grateful for support from the Academy of Medical Sciences under the Springboard Awards scheme (SBF005/1136), and the Bill and Melinda Gates Foundation. FJL is supported by a Wellcome Trust Collaborative Award (215573/Z/19/Z). The WIN is supported by core funding from the Wellcome Trust (203139/Z/16/Z). The computational aspects were supported by the Wellcome Trust (203141/Z/16/Z) and the NIHR Oxford BRC. Corresponding authors: ACR (andreiroibu@icloud.com), SA (stanislaw.adaszewski@roche.com) and AILN (ana.namburete@cs.ox.ac.uk).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This entry contains pre-processed LD files (Sigma matrix, S matrix, etc.) computed on unrelated British samples of the UK Biobank (n = 306604). It is intended to be used as an input to the GhostKnockoffGWAS pipeline.
Note: We previously released another set of EUR LD files. This set of LD files should be preferred over the previous one. The main difference is that the previous entry used quasi-independent blocks from LDetect computed on the 1000 Genomes Project, whereas here the independent blocks are computed using snp_ldsplit directly on the UK Biobank British samples.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistically significant signals from the metaUSAT analysis are shown in the left-hand column. The central column shows the association p-values for those SNPs in the six original GWAS analyses, with the direction of effect indicated by a + or - sign. Candidate genes are those selected from the prioritised genes (using the four mapping strategies described previously for all GWAS-discovered loci) or genes in proximity as identified within the UCSC genome browser.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This entry contains pre-processed LD files (Sigma matrix, S matrix, etc.) computed on the EUR cohort of the Pan-UKB LD data. It is intended to be used as an input to the GhostKnockoffGWAS pipeline.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We present a database of representative left and right ventricular meshes constructed from patient-specific models based on a large cohort of ~55k participants from UK Biobank. It comprises 1423 representative tetrahedral finite element meshes across sex (male, female), body mass index (range: 16 - 42 kg/m²) and age (range: 49 - 80 years).
For each mesh, it also includes:
We also present trained network weights and nnUNet plan and hyperparameter selection files for cine MR segmentation models trained separately for the following views: 2 chamber, 3 chamber, 4 chamber and short axis. These are supplied as a zip of relevant nnUNet files for each view: Dataset101_UKBB_LAX_2Ch.zip, Dataset102_UKBB_LAX_3Ch.zip, Dataset103_UKBB_LAX_4Ch.zip, Dataset100_UKBB_Petersen_SAX.zip.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Unique Data Identifier (UDI) codes (DOCX)
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project contains datasets related to:
Uncovering methylation-dependent genetic effects on regulatory element function in diverse genomes
Rachel M. Petersen, Christopher M. Vockley, Amanda J. Lea
A preprint of this work can be found here: https://www.biorxiv.org/content/10.1101/2024.08.23.609412v1
Specifically, the data provided here are:
1) replicateinfo.txt contains metadata for each mSTARR-seq replicate, including replicate number, pool number, sample type (DNA vs RNA) and methylation status
2) rnadnacounts_400bpwin.txt contains a count matrix with the number of DNA and RNA reads falling within each 400 bp genomic window for each replicate. Columns are replicate names, rows are genomic windows.
3) Joint_genotyping.vcf contains results from joint genotyping analysis using DNA sequences generated in the current study from 25 individuals accessed through the 1000 Genomes Project.
4) ASE_data.zip contains
5) model_results.zip contains
6) Comparison_datasets.zip contains
7) GWAS_EWAS_overlap_files.zip contains
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
GWAS summary statistics for major depressive disorder from the PGC MDD2 (Wray et al.), excluding 23andMe and UK Biobank.
Cite Wray et al. 2018 (source of cohort summary statistics) and Howard et al. 2019 (source of UKB/PGC overlap resolution).
Update 2022/03/07
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This entry contains pre-processed LD files (Sigma matrix, S matrix, etc.) computed on Caribbean samples of the UK Biobank (n = 4517). It is intended to be used as an input to the GhostKnockoffGWAS pipeline.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This entry contains pre-processed LD files (Sigma matrix, S matrix, etc.) computed on Chinese samples of the UK Biobank (n = 1574). It is intended to be used as an input to the GhostKnockoffGWAS pipeline.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all model weights and corresponding datasets generated by Betti et al. in the manuscript Genetically regulated enhancer RNA expression predicts enhancer-promoter contact frequency and reveals genetic mechanisms at complex trait-associated loci. The following are the contents of the sub-directories in this dataset:
Please cite:
Betti, M.J., Aldrich, M.C., Lin, P., & Gamazon, E.R. (2024). Genetically regulated enhancer RNA expression predicts enhancer-promoter contact frequency and reveals genetic mechanisms at complex trait-associated loci. Preprint.
Betti, M.J., Aldrich, M.C., Lin, P., & Gamazon, E.R. (2024). eRNA GReX (Version 1.0). Zenodo. 10.5281/zenodo.11212496
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are the genome-wide association study (GWAS) statistics in the UK Biobank and Source Data files for our paper Chen ZJ, Das SS, Kar A, Lee SHT, Abuhanna KD, Alvarez M, Sukhatme MG, Wang Z, Gelev KZ, Heffel MG, Zhang Y, Avram O, Rahmani E, Sankararaman S, Heinonen S, Peltoniemi H, Halperin E, Pietiläinen KH, Luo C, Pajukanta P. Single-cell DNA methylome and 3D genome atlas of human subcutaneous adipose tissue.
Further details of these analyses can be found in the Methods and Results part of this paper.
Repository contents
GWAS summary statistics in the UK Biobank for C-reactive protein (CRP), body mass index (BMI), metabolic dysfunction-associated steatotic liver disease (MASLD), and waist-to-hip ratio adjusted for BMI (WHRadjBMI):
Figure source data:
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all model weights and corresponding datasets generated by Betti et al. in the manuscript Genetically regulated enhancer RNA expression predicts enhancer-promoter contact frequency and reveals genetic mechanisms at complex trait-associated loci. The following are the contents of the sub-directories in this dataset:
Please cite:
Betti, M.J., Aldrich, M.C., Lin, P., & Gamazon, E.R. (2024). Genetically regulated enhancer RNA expression predicts enhancer-promoter contact frequency and reveals genetic mechanisms at complex trait-associated loci. Preprint.
Betti, M.J., Aldrich, M.C., Lin, P., & Gamazon, E.R. (2024). eRNA GReX (Version 2.0). Zenodo. 10.5281/zenodo.14027849
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically