71 datasets found
  1. Source Data 5. The dataset derived from the UK Biobank for the G-E...

    • figshare.com
    application/gzip
    Updated Dec 12, 2023
    + more versions
    Cite
    Han Zhang (2023). Source Data 5. The dataset derived from the UK Biobank for the G-E interaction analysis. [Dataset]. http://doi.org/10.6084/m9.figshare.24154983.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Dec 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Han Zhang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains information on the mQTLs of smoking-related methylation and is used to perform the G-E interaction analysis (for CD).

  2. UK Biobank

    • healthinformationportal.eu
    html
    Updated Mar 31, 2023
    Cite
    UK Biobank (2023). UK Biobank [Dataset]. https://www.healthinformationportal.eu/health-information-sources/uk-biobank
    Explore at:
    Available download formats: html
    Dataset updated
    Mar 31, 2023
    Dataset authored and provided by
    UK Biobank
    Variables measured
    sex, title, topics, country, funding, language, data_owners, description, sample_size, age_range_to, and 17 more
    Measurement technique
    Population data
    Dataset funded by
    Medical Research Council (http://mrc.ukri.org/)
    Wellcome Trust (https://wellcome.org/)
    Department of Health and Social Care (https://gov.uk/dhsc)
    Description

    The objective of UK Biobank is to create a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants, which will contribute to the advancement of modern medicine, treatment and scientific discoveries that improve human health.

    Lifestyle and environmental information, medical history, physical measurements, and biological samples are being collected from about 500,000 people aged 40-69 at presentation and then, with consent, their health will be followed for many years through medical and other health related records. The biological samples are stored so that they can be used for a wide range of biochemical and genetic analyses in the future.

  3. Data collected at the baseline assessment.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Cathie Sudlow; John Gallacher; Naomi Allen; Valerie Beral; Paul Burton; John Danesh; Paul Downey; Paul Elliott; Jane Green; Martin Landray; Bette Liu; Paul Matthews; Giok Ong; Jill Pell; Alan Silman; Alan Young; Tim Sprosen; Tim Peakman; Rory Collins (2023). Data collected at the baseline assessment. [Dataset]. http://doi.org/10.1371/journal.pmed.1001779.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS Medicine
    Authors
    Cathie Sudlow; John Gallacher; Naomi Allen; Valerie Beral; Paul Burton; John Danesh; Paul Downey; Paul Elliott; Jane Green; Martin Landray; Bette Liu; Paul Matthews; Giok Ong; Jill Pell; Alan Silman; Alan Young; Tim Sprosen; Tim Peakman; Rory Collins
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    Data collected at the baseline assessment. Footnotes: * assessed in 170,000 participants; † assessed in 50,000 participants; ‡ measured in one heel for 170,000 participants and in both heels for 320,000 participants; ¶ measured in 170,000 participants; § measured in 100,000 participants.

  4. GWAS on self-reported hearing difficulty in the UK Biobank

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1 more
    Updated Jan 24, 2020
    Cite
    Wells, H.R.R. (2020). GWAS on self-reported hearing difficulty in the UK Biobank [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3490749
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Freidin, M.B.
    Morton, C.C.
    Zainul Abidin, F.N.
    Moore, D.R.
    Payton, A.
    Wells, H.R.R.
    Dawson, S.J.
    Williams, F.M.K.
    Dawes, P
    Munro, K.J.
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains the results of two genome-wide association studies for age-related hearing impairment (ARHI)-related traits, as described in the following publication:

    Wells HRR, Freidin MB, Zainul Abidin FN, Payton A, Dawes P, Munro KJ, Morton CC, Moore DR, Dawson SJ, Williams FMK. GWAS Identifies 44 Independent Associated Genomic Loci for Self-Reported Adult Hearing Difficulty in UK Biobank. Am J Hum Genet. 2019 Oct 3;105(4):788-802. doi: 10.1016/j.ajhg.2019.09.008. Epub 2019 Sep 26.

    Please cite the article if using this dataset.

    Two files provide summary statistics for discovery analysis of Hearing difficulty (HD) and Hearing aid use (HAID) phenotypes for individuals of European descent from UK Biobank.

    Acknowledgements

    The research was carried out using the UK Biobank Resource under application number 11516. H.R.R.W. is funded by a PhD Studentship Grant, S44, from Action on Hearing Loss. The study was also supported by funding from NIHR UCLH BRC Deafness and Hearing Problems Theme, a grant from MED_EL, and the NIHR Manchester Biomedical Research Centre. The English Longitudinal Study of Aging is jointly run by University College London, Institute for Fiscal Studies, University of Manchester, and National Centre for Social Research. Genetic analyses have been carried out by UCL Genomics and funded by the Economic and Social Research Council and the National Institute on Aging. Data governance was provided by the METADAC data access committee, funded by ESRC, Wellcome, and MRC (2015-2018: Grant Number MR/N01104X/1 2018-2020: Grant Number ES/S008349/1). TwinsUK is funded by the Wellcome Trust, Medical Research Council, European Union, the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility, and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London. We would like to thank all the participants of UK Biobank, English Longitudinal Study of Aging, and TwinsUK.

    Column headers:

    SNP, SNP rsID

    CHR, chromosome

    BP, genomic position (GRCh37 build)

    ALLELE1, effect allele (coded as "1")

    ALLELE0, reference allele (coded as "0")

    A1FREQ, effect allele frequency

    INFO, imputation quality

    BETA, effect size of effect allele

    SE, standard error of effect size

    P, P-value of association (without GC correction)
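
    As a minimal sketch of how the columns listed above might be read, the following Python snippet parses a delimited summary-statistics table and keeps the variants below the conventional genome-wide significance threshold (P < 5e-8). The tab delimiter, the function name `significant_variants`, and the two sample rows are assumptions made for illustration; they are not taken from the deposited files.

```python
import csv
import io

# Column names as listed in the dataset description.
COLUMNS = ["SNP", "CHR", "BP", "ALLELE1", "ALLELE0",
           "A1FREQ", "INFO", "BETA", "SE", "P"]


def significant_variants(handle, threshold=5e-8, delimiter="\t"):
    """Return the rows whose P-value is below `threshold`.

    Assumes a header row containing COLUMNS and a tab delimiter
    (both hypothetical; check the actual files before relying on this).
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [row for row in reader if float(row["P"]) < threshold]


# Invented example rows, for illustration only (not real results).
SAMPLE = (
    "SNP\tCHR\tBP\tALLELE1\tALLELE0\tA1FREQ\tINFO\tBETA\tSE\tP\n"
    "rs0000001\t1\t100000\tA\tG\t0.25\t0.99\t0.010\t0.0015\t1e-10\n"
    "rs0000002\t2\t200000\tT\tC\t0.40\t0.95\t0.001\t0.0015\t0.5\n"
)

hits = significant_variants(io.StringIO(SAMPLE))
print([row["SNP"] for row in hits])
```

    Because the P column is described as not GC-corrected, any genomic-control adjustment would have to be applied separately before filtering.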

  5. Synthetic datasets of the UK Biobank cohort

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, pdf, zip
    Updated Feb 6, 2025
    Cite
    Antonio Gasparrini; Jacopo Vanoli (2025). Synthetic datasets of the UK Biobank cohort [Dataset]. http://doi.org/10.5281/zenodo.13983170
    Explore at:
    Available download formats: bin, csv, zip, pdf
    Dataset updated
    Feb 6, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Antonio Gasparrini; Jacopo Vanoli
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository stores synthetic datasets derived from the database of the UK Biobank (UKB) cohort.

    The datasets were generated for illustrative purposes, in particular for reproducing specific analyses on the health risks associated with long-term exposure to air pollution using the UKB cohort. The code used to create the synthetic datasets is available and documented in a related GitHub repo, with details provided in the section below. These datasets can be freely used for code testing and for illustrating other examples of analyses on the UKB cohort.

    Note: while the synthetic versions of the datasets resemble the real ones in several aspects, the users should be aware that these data are fake and must not be used for testing and making inferences on specific research hypotheses. Even more importantly, these data cannot be considered a reliable description of the original UKB data, and they must not be presented as such.

    The original datasets are described in the article by Vanoli et al in Epidemiology (2024) (DOI: 10.1097/EDE.0000000000001796) [freely available here], which also provides information about the data sources.

    The work was supported by the Medical Research Council-UK (Grant ID: MR/Y003330/1).

    Content

    The series of synthetic datasets (stored in two versions with csv and RDS formats) are the following:

    • synthbdcohortinfo: basic cohort information regarding the follow-up period and birth/death dates for 502,360 participants.
    • synthbdbasevar: baseline variables, mostly collected at recruitment.
    • synthpmdata: annual average exposure to PM2.5 for each participant reconstructed using their residential history.
    • synthoutdeath: death records that occurred during the follow-up with date and ICD-10 code.

    In addition, this repository provides these additional files:

    • codebook: a pdf file with a codebook for the variables of the various datasets, including references to the fields of the original UKB database.
    • asscentre: a csv file with information on the assessment centres used for recruitment of the UKB participants, including code, names, and location (as northing/easting coordinates of the British National Grid).
    • Countries_December_2022_GB_BUC: a zip file including the shapefile defining the boundaries of the countries in Great Britain (England, Wales, and Scotland), used for mapping purposes [source].

    Generation of the synthetic data

    The datasets resemble the real data used in the analysis, and they were generated using the R package synthpop (www.synthpop.org.uk). The generation process involves two steps, namely the synthesis of the main data (cohort info, baseline variables, annual PM2.5 exposure) and then the sampling of death events. The R scripts for performing the data synthesis are provided in the GitHub repo (subfolder Rcode/synthcode).

    The first part merges all the data including the annual PM2.5 levels in a single wide-format dataset (with a row for each subject), generates a synthetic version, adds fake IDs, and then extracts (and reshapes) the single datasets. In the second part, a Cox proportional hazard model is fitted on the original data to estimate risks associated with various predictors (including the main exposure represented by PM2.5), and then these relationships are used to simulate death events in each year. Details on the modelling aspects are provided in the article.

    This process guarantees that the synthetic data do not hold specific information about the original records, thus preserving confidentiality. At the same time, the multivariate distribution and correlation across variables as well as the mortality risks resemble those of the original data, so the results of descriptive and inferential analyses are similar to those in the original assessments. However, as noted above, the data are used only for illustrative purposes, and they must not be used to test other research hypotheses.
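
    The second generation step described above (simulating yearly death events from fitted hazard relationships) can be illustrated with a small Python sketch. This is not the authors' R code, which uses synthpop and a Cox model fitted on the real data. The constant within-year baseline hazard, the coefficient values, and all function names are hypothetical, chosen only to show how a proportional-hazards linear predictor can be turned into an annual death probability.

```python
import math
import random


def annual_death_probability(baseline_hazard, log_hazard_ratios, covariates):
    """P(death within one year) under a proportional-hazards model with a
    hazard assumed constant within the year: 1 - exp(-h0 * exp(x'beta))."""
    linear_predictor = sum(b * x for b, x in zip(log_hazard_ratios, covariates))
    hazard = baseline_hazard * math.exp(linear_predictor)
    return 1.0 - math.exp(-hazard)


def simulate_death_year(years, yearly_covariates, baseline_hazard,
                        log_hazard_ratios, rng):
    """Return the first year in which a simulated death occurs, or None if
    the subject survives the whole follow-up."""
    for year, covariates in zip(years, yearly_covariates):
        p = annual_death_probability(baseline_hazard, log_hazard_ratios,
                                     covariates)
        if rng.random() < p:
            return year
    return None


# Hypothetical inputs: one covariate (annual PM2.5) and made-up coefficients.
rng = random.Random(0)
years = [2010, 2011, 2012]
pm25_by_year = [[10.0], [11.0], [9.5]]
death_year = simulate_death_year(years, pm25_by_year,
                                 baseline_hazard=0.005,
                                 log_hazard_ratios=[0.01], rng=rng)
print(death_year)
```

    Drawing events year by year, rather than sampling a single survival time, lets time-varying covariates such as annual PM2.5 enter the simulated risk directly.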

  6. TwinsUK

    • healthdatagateway.org
    unknown
    Updated Dec 30, 2021
    Cite
    TwinsUK is funded by the Wellcome Trust, Medical Research Council, Versus Arthritis, European Union Horizon 2020, Chronic Disease Research Foundation (CDRF), Zoe Global Ltd and the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London. (2021). TwinsUK [Dataset]. https://healthdatagateway.org/dataset/728
    Explore at:
    Available download formats: unknown
    Dataset updated
    Dec 30, 2021
    Dataset provided by
    TwinsUK (http://www.twinsuk.ac.uk/)
    Wellcome Trust (https://wellcome.org/)
    National Institute for Health and Care Research
    Authors
    TwinsUK is funded by the Wellcome Trust, Medical Research Council, Versus Arthritis, European Union Horizon 2020, Chronic Disease Research Foundation (CDRF), Zoe Global Ltd and the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London.
    License

    https://twinsuk.ac.uk/resources-for-researchers/access-our-data/

    Description

    The TwinsUK cohort (https://twinsuk.ac.uk/), set up in 1992, is a major volunteer-based genomic epidemiology resource with longitudinal deep genomic and phenomic data from over 15,000 adult twins (18+) from across the UK who are highly engaged and recallable. The cohort is predominantly female (80%) for historical reasons. It is one of the most deeply characterised adult twin cohorts in the world, providing a rich platform for scientists to research health and ageing longitudinally. There are over 700,000 biological samples stored, and data have been collected on twins with repeat measures at multiple timepoints. Extremely large datasets (billions of data points) have been generated for each TwinsUK participant over 30 years, including phenotypes from questionnaires, multiple clinical visits, and record linkage, and genetic and ‘omic data from biological samples. TwinsUK ensures that datasets derived from raw data are returned by collaborators to enhance the resource. TwinsUK also holds a wide range of laboratory samples, including plasma, serum, DNA, faecal microbiome and tissue (skin, fat, colonic biopsies) within HTA-regulated facilities at King's College London.

    More recently, postal and at-home collection strategies have allowed sample collections from frail twins, from our whole cohort for COVID-19 studies, and from new twin recruits. The cohort is recallable either at a four-year longitudinal sweep visit or on the basis of diagnosis or genotype.

    More than 1,000 data access collaborations and 250,000 samples have been shared with external researchers, resulting in over 800 publications since 2012.

    TwinsUK is now working to link to twins’ official health, education and environmental records for health research purposes, which will further enhance the resource.

  7. Data from: Brain Ages Derived from Different MRI Modalities are Associated...

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Apr 24, 2025
    Cite
    Andrei-Claudiu Roibu; Stanislaw Adaszewski; Torsten Schindler; Stephen M. Smith; Ana I.L. Namburete; Frederik J. Lange (2025). Brain Ages Derived from Different MRI Modalities are Associated with Distinct Biological Phenotypes [Dataset]. http://doi.org/10.5281/zenodo.8110876
    Explore at:
    Available download formats: csv
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrei-Claudiu Roibu; Stanislaw Adaszewski; Torsten Schindler; Stephen M. Smith; Ana I.L. Namburete; Frederik J. Lange
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    Brain ageing is a highly variable, spatially and temporally heterogeneous process, marked by numerous structural and functional changes. These can cause discrepancies between individuals’ chronological age and the apparent age of their brain, as inferred from neuroimaging data. Machine learning models, and particularly Convolutional Neural Networks (CNNs), have proven adept in capturing patterns relating to ageing induced changes in the brain. The differences between the predicted and chronological ages, referred to as brain age deltas, have emerged as useful biomarkers for exploring those factors which promote accelerated ageing or resilience, such as pathologies or lifestyle factors. However, previous studies rely only on structural neuroimaging for predictions, overlooking potentially informative functional and microstructural changes. Here we show that multiple contrasts derived from different MRI modalities can predict brain age, each encoding bespoke brain ageing information. By using 3D CNNs and UK Biobank data, we found that 57 contrasts derived from structural, susceptibility-weighted, diffusion, and functional MRI can successfully predict brain age. For each contrast, different patterns of association with non-imaging phenotypes were found, resulting in a total of 191 unique, statistically significant associations. Furthermore, we found that ensembling data from multiple contrasts results in both higher prediction accuracies and stronger correlations to non-imaging measurements. Our results demonstrate that other 3D contrasts and modalities, which have not been considered so far for the task of brain age prediction, encode different information about the ageing brain. We envision our work as being the starting point for future investigations into the causal links underpinning the observed brain age deltas and non-imaging measurement associations. 
For instance, drug effects can be monitored, given that certain medications correlated with accelerated brain ageing. Furthermore, continued development of brain age models could facilitate their deployment in clinical trials for recruitment and monitoring, and hospitals for diagnostic and screening tasks.

    Data Description

    This dataset contains the full correlation results with all nIDPs in the UK Biobank. These are presented in datasets split by sex into Female and Male subjects. For easier data manipulation, two smaller datasets have also been made available, containing only those correlations that pass the False Discovery Rate (FDR) threshold.

    As experiments were also conducted for ensembles using multiple contrasts, similar datasets are provided for those.

    Finally, global datasets are also provided. These are the concatenation of the associations contained in the Male and Female datasets.
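
    To illustrate what passing an FDR threshold can mean for a list of association P-values, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The description above does not state which FDR procedure or significance level was used, so both the choice of Benjamini-Hochberg and `alpha=0.05` are assumptions, and the P-values in the example are invented.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level `alpha`."""
    m = len(p_values)
    # Indices of the P-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Invented P-values, for illustration only.
example = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(example))
```

    The procedure is step-up: a P-value that misses its own rank threshold is still rejected if a larger P-value further down the sorted list meets its threshold.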

    Paper & Code

    The original paper for this article can be accessed here:

    To access the codes relevant for this project, please access the project GitHub Repos:

    If using this work, please cite it based on the above paper, or using the following BibTex:

    @inproceedings{roibu2023brain,
     title={Brain Ages Derived from Different MRI Modalities are Associated with Distinct Biological Phenotypes},
     author={Roibu, Andrei-Claudiu and Adaszewski, Stanislaw and Schindler, Torsten and Smith, Stephen M and Namburete, Ana IL and Lange, Frederik J},
     booktitle={2023 10th IEEE Swiss Conference on Data Science (SDS)},
     pages={17--25},
     year={2023},
     organization={IEEE},
     doi={10.1109/SDS57534.2023.00010}
    }

    Data Access

    The data for this project is freely available upon application at the UK Biobank. For more information regarding the individual nIDPs, please access the UK Biobank Showcase website at: https://biobank.ctsu.ox.ac.uk/showcase/search.cgi

    Funding

    ACR is supported by EPSRC Grant EP/S024093/1, F. Hoffmann-La Roche AG and a 2021 Industrial Fellowship offered by the Royal Commission for the Exhibition of 1851. SMS is supported by a Wellcome Trust Collaborative Award 215573/Z/19/Z. AILN is grateful for support from the Academy of Medical Sciences under the Springboard Awards scheme (SBF005/1136), and the Bill and Melinda Gates Foundation. FJL is supported by a Wellcome Trust Collaborative Award (215573/Z/19/Z). The WIN is supported by core funding from the Wellcome Trust (203139/Z/16/Z). The computational aspects were supported by the Wellcome Trust (203141/Z/16/Z) and the NIHR Oxford BRC. Corresponding authors: ACR (andreiroibu@icloud.com), SA (stanislaw.adaszewski@roche.com) and AILN (ana.namburete@cs.ox.ac.uk).

  8. Data from: Sociability GWAS in a population-based sample

    • lifesciences.datastations.nl
    • narcis.nl
    pdf, zip
    Updated May 4, 2021
    Cite
    J.B. Bralten; N. Roth Mota; C.J.H.M. Klemann; de de Witte (2021). Sociability GWAS in a population-based sample [Dataset]. http://doi.org/10.17026/DANS-ZTJ-ZGA6
    Explore at:
    Available download formats: zip(244782791), zip(24497), pdf(114065)
    Dataset updated
    May 4, 2021
    Dataset provided by
    DANS Data Station Life Sciences
    Authors
    J.B. Bralten; N. Roth Mota; C.J.H.M. Klemann; de de Witte
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Levels of sociability are continuously distributed in the general population, and decreased sociability represents an early manifestation of several brain disorders. Here, we investigated the genetic underpinnings of sociability in the population.

    Main questions of our research

    1. Are there common genetic variants that are associated with sociability in the general population?
    2. Are genetic variants that are associated with sociability also associated with neuropsychiatric disorders?

    Type of data uploaded in this repository

    The UK Biobank project (see https://www.ukbiobank.ac.uk/) is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. The database is globally accessible to approved researchers undertaking vital research into the most common and life-threatening diseases. The raw data that this project is based on come from the publicly available UK Biobank set, which is very large and is therefore not provided here. Here we provide only the results of our analysis, which is also described at https://www.biorxiv.org/content/10.1101/781195v2 and is currently in revision at a scientific journal. In the dataset you will find the association of 9327396 genetic variants with the phenotype sociability. This dataset is not suitable for opening in Excel and is best opened on a cluster computer or with specific software.

    Subjects

    The UK Biobank (UKBB) is a major population-based cohort from the United Kingdom that includes individuals aged between 37 and 73 years. We constructed a sociability measure based on the aggregation of scores per participant on four questions from the UKBB database that link to sociability, including (1) a question about the frequency of friend/family visits, (2) a question on the number and type of social venues that are visited, (3) a question about worrying after social embarrassment and (4) a question about feeling lonely, leading to a sociability score ranging from 0 to 4. Participants were excluded if they had somatic problems that could be related to social withdrawal (BMI < 15 or BMI > 40, narcolepsy (all the time), stroke, severe tinnitus, deafness or brain-related cancers) or if they answered “No friends/family outside household”, “Do not know” or “Prefer not to answer” to any of the questions.

    SNP genotyping and quality control

    Details about the available genome-wide genotyping data for UKBB participants have been reported previously (PMID: 30305743). We used third-release genotyping data (see https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100319). Briefly, 49,950 participants were genotyped using the UK BiLEVE Axiom Array and 438,427 participants were genotyped using the UK Biobank Axiom Array. Genotypes were imputed into the dataset using the Haplotype Reference Consortium (HRC) and the UK10K haplotype resource. To account for ethnicity, we included only those individuals who identified themselves as "white" by self-report and plotted the Principal Components (PCs) provided by the UKBB, excluding individuals considered to be outliers according to PCs 1 and 2. Genetic relatedness calculated with KING kinship and provided by the UKBB (https://kenhanscombe.github.io/ukbtools/articles/explore-ukb-data.html ; http://www.ukbiobank.ac.uk/wp-content/uploads/2014/04/UKBiobank_genotyping_QC_documentation-web.pdf) was used to identify first- and second-degree relatives. Subsequently, ‘families’ (i.e. clusters of related individuals above an IBD>0.125 threshold) were created and only one individual from each of these ‘families’ was included in the analysis. If self-reported sex and SNP-based sex differed, individuals were excluded from further analysis. Single nucleotide polymorphisms (SNPs) with minor allele frequency <0.005, Hardy-Weinberg equilibrium test P value <1e−6, missing genotype rate >0.05, or imputation quality of INFO <0.8 were excluded. In the current study, all analyses are based on 342,461 participants of European ancestry for whom both genotype data and sociability scores were available.

    Genome-wide association analysis

    Genome-wide association analysis with the imputed marker dosages was performed in PLINK1.9, using a linear regression model with the sociability measure as the dependent variable and including sex, age, 10 first PCs, assessment center, and genotype batch as covariates. SNPs were considered significantly associated if they had p-value < 5e-8. Associated loci were considered independent of each other at r2 0.6 and lead SNPs were classified as the SNP with the smallest association p-value and at r2 0.1, using a 250kb window. The summary statistics come from the plink2 linear regression analysis.
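
    The SNP exclusion thresholds and the significance cut-off quoted above can be written as a small Python sketch. This is an illustration only, not the authors' PLINK workflow; the class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    maf: float           # minor allele frequency
    hwe_p: float         # Hardy-Weinberg equilibrium test P-value
    missing_rate: float  # missing genotype rate
    info: float          # imputation quality (INFO)


def passes_qc(v: VariantStats) -> bool:
    """Keep a SNP unless it meets any exclusion criterion from the text:
    MAF < 0.005, HWE P < 1e-6, missing rate > 0.05, or INFO < 0.8."""
    return (v.maf >= 0.005
            and v.hwe_p >= 1e-6
            and v.missing_rate <= 0.05
            and v.info >= 0.8)


def is_genome_wide_significant(p_value: float) -> bool:
    """Significance threshold stated in the description (p < 5e-8)."""
    return p_value < 5e-8


# Invented variants, for illustration only.
common_well_imputed = VariantStats(maf=0.20, hwe_p=0.5,
                                   missing_rate=0.01, info=0.95)
rare_variant = VariantStats(maf=0.001, hwe_p=0.5,
                            missing_rate=0.01, info=0.95)
print(passes_qc(common_well_imputed), passes_qc(rare_variant))
```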

  9. Source Data 2. The dataset derived from the UK Biobank for the cohort study....

    • figshare.com
    application/gzip
    Updated Dec 12, 2023
    + more versions
    Cite
    Han Zhang (2023). Source Data 2. The dataset derived from the UK Biobank for the cohort study. [Dataset]. http://doi.org/10.6084/m9.figshare.24154980.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Dec 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Han Zhang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data are used to conduct a cohort study evaluating the association between smoking and the risk of inflammatory bowel disease.

  10. Data from: Genome-wide association study of nociceptive musculoskeletal pain...

    • lifesciences.datastations.nl
    • explore.openaire.eu
    application/gzip, pdf +1
    Updated Jan 11, 2022
    Cite
    S. Li; G.J.V. Poelmans; R.L.M. van Boekel; M.J.H. Coenen (2022). Data from: Genome-wide association study of nociceptive musculoskeletal pain treatment response in UK Biobank [Dataset]. http://doi.org/10.17026/DANS-XNS-UN6C
    Explore at:
    Available download formats: zip(25047), pdf(85206), application/gzip(242906210)
    Dataset updated
    Jan 11, 2022
    Dataset provided by
    DANS Data Station Life Sciences
    Authors
    S. Li; G.J.V. Poelmans; R.L.M. van Boekel; M.J.H. Coenen
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Drug treatment for nociceptive musculoskeletal pain (NMP) follows a three-step analgesic ladder, starting from non-steroidal anti-inflammatory drugs (NSAIDs), followed by weak or strong opioids until the pain is under control. Here, we conducted a genome-wide association study (GWAS) of a binary phenotype comparing NSAID users and opioid users as a proxy of treatment response to NSAID using data from the UK Biobank. We aim to find the common genetic variants associated with pain treatment response in the general population.Type of data uploaded in this repositoryUK Biobank is a large-scale biomedical database and research resource containing in-depth genetic and health information from half a million UK participants (https://www.ukbiobank.ac.uk/). The database is globally accessible to approved researchers undertaking vital research into the most common and life-threatening diseases. As the raw data is quite large and only available upon application to UKB, we only provide the results from our analysis, which is also described here: medrxiv and currently in revision in a scientific journal. In the dataset, you will find the association of 9,435,994 SNPs genetic variants with the pain treatment response (PTR) phenotype. This dataset is not applicable to be opened with Excel and can best be opened on a cluster computer or using specific software.SubjectsThe UK Biobank is a general population cohort with over 0.5 million participants aged 40–69 recruited across the United Kingdom (UK). We derived a phenotype as a proxy for the pain treatment response to NSAIDs by using recently released primary care (general practitioners', GPs') data, which contains longitudinal structured diagnosis and prescription data. To define the PTR phenotype, we first extracted all nociceptive musculoskeletal pain (NMP) treatments and diagnoses from the GP data. 
NMP diagnosis was primarily selected from the chapters on musculoskeletal and connective tissue diseases and relevant symptoms or signs from other chapters in the Read codes (versions 2 and 3). See Supplementary data 1 on medrxiv for the diagnosis codes included in this study. Secondly, pain prescriptions (NSAID and opioid) were extracted from the GP data using the British national formulary (BNF), dictionary of medicines and devices (dmd), and Read code (version 2) for data extraction. An overview of the extracted medication codes is provided in Supplementary data 2 on medrxiv. Only participants with an NMP diagnosis record and a pain prescription record occurring on the same date were included for analysis to ensure that we would only include pain treatment for NMP.PhenotypeBased on the information of NMP and pain prescriptions from the UK biobank, a dichotomous score was used for the binary (case/control) PTR phenotype: NSAID users were defined as controls and opioid users as cases. Two additional quality control (QC) steps were applied. First, participants with only one treatment event were removed to safeguard the inclusion of only participants with relatively long-term treatment. Second, a chronological check was applied for the first prescription of each ladder to ensure that the treatment ladder was correctly followed, i.e., initial NSAID use was followed by weak or strong opioids. Participants that were not treated according to this order were removed.SNP genotyping and quality controlGenotyping procedures have been described in detail elsewhere [PMID: 30305743].The third-release genotyping data were used for analysis (see https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100319).Participants passing quality control were included for analysis. 
    QC steps for the samples included removal of participants with (1) inconsistent self-reported and genetically determined sex, (2) missing individual genetic data with a frequency of more than 0.1, (3) putative sex-chromosome aneuploidy. Participants were also excluded from the analysis if they were outliers for heterozygosity or missingness, were not of white British ancestry based on the genotype, or had missing covariate data. Note that when we fit the linear mixed model in GCTA, it reported that the number of closely related participants was low. Therefore, we did not further remove related individuals from the sample.

    Routine QC steps for genetic markers on autosomes included removal of single nucleotide polymorphisms (SNPs) with (1) an imputation quality score less than 0.8, (2) a minor allele frequency (MAF) less than 0.005, (3) a Hardy-Weinberg equilibrium (HWE) test P-value less than 1 × 10⁻⁶, and (4) a genotyping call rate less than 0.95.

    Genome-wide association analysis

    A GWAS for the binary PTR phenotype was conducted using a linear function in GCTA [38] for markers on the autosomal chromosomes, adjusting for age, sex, BMI, depression history, smoking status, drinking frequency, assessment center, genotyping array, and the first ten principal components (PCs). The following variables from the UK Biobank data set...
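
    The four marker-level QC thresholds listed in this description can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Marker` record, its field names, and the example values are hypothetical and are not taken from the authors' pipeline, which is not shown in the listing.

```python
from dataclasses import dataclass

# Thresholds as stated in the description; markers below any of them are removed.
INFO_MIN = 0.8        # imputation quality score
MAF_MIN = 0.005       # minor allele frequency
HWE_P_MIN = 1e-6      # Hardy-Weinberg equilibrium test P-value
CALL_RATE_MIN = 0.95  # genotyping call rate


@dataclass
class Marker:
    """Hypothetical per-SNP QC record (field names are assumptions)."""
    snp_id: str
    info: float
    maf: float
    hwe_p: float
    call_rate: float


def passes_qc(m: Marker) -> bool:
    """Return True when the marker survives all four filters."""
    return (
        m.info >= INFO_MIN
        and m.maf >= MAF_MIN
        and m.hwe_p >= HWE_P_MIN
        and m.call_rate >= CALL_RATE_MIN
    )


# Made-up example markers, one failing each criterion in turn.
markers = [
    Marker("rs_ok", info=0.95, maf=0.20, hwe_p=0.30, call_rate=0.99),
    Marker("rs_low_info", info=0.50, maf=0.20, hwe_p=0.30, call_rate=0.99),
    Marker("rs_rare", info=0.95, maf=0.001, hwe_p=0.30, call_rate=0.99),
    Marker("rs_hwe", info=0.95, maf=0.20, hwe_p=1e-9, call_rate=0.99),
    Marker("rs_low_call", info=0.95, maf=0.20, hwe_p=0.30, call_rate=0.90),
]

kept = [m.snp_id for m in markers if passes_qc(m)]
print(kept)
```

    Keeping the thresholds as named constants makes it easy to compare them against the values quoted in the text.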

  11. UK Biobank MGUS GWAS Results

    • zenodo.org
    application/gzip, png
    Updated Jul 7, 2024
    Cite
    Murat Güler; Federico Canzian (2024). UK Biobank MGUS GWAS Results [Dataset]. http://doi.org/10.5281/zenodo.10533713
    Available download formats
    png, application/gzip
    Dataset updated
    Jul 7, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Murat Güler; Federico Canzian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 1, 2024
    Description

    The UK Biobank is a large-scale biomedical database and research resource containing genetic and health information from half a million individuals aged 40 to 69 years in the United Kingdom. The genotyping methods and quality control steps have been reported previously (PMID: 30305743). MGUS cases were defined from cancer registry data (Data-Field 40011 and Data-Field 40006) using ICD-10 code D47.2 and ICD-O-3 code 9765. The control group was created by removing participants who had any cancer-related record in another cancer registry, a hospital record, or a self-reported history of cancer. From the whole cohort, participants were removed if they had a non-white British ethnic background, sex chromosome aneuploidy, genetic relatedness exclusions, recommended genomic analysis exclusions, or a mismatch between genetic and reported sex. After these exclusion steps, a total of 107 MGUS cases and 277,496 controls were used for GWAS analysis. The association models in both steps also included the following covariates: age (cases: age at diagnosis, controls: age at recruitment), sex, genotyping array, and the first 10 genetic principal components (PCs).

    Column Names in the summary statistics:

    CHROM: Chromosome

    GENPOS: GRCh37 position

    ID: rsID

    ALLELE0: Non-effect allele

    ALLELE1: Effect allele

    A1FREQ: Frequency of effect allele

    A1FREQ_CASES

    A1FREQ_CONTROLS

    INFO

    N

    N_CASES

    N_CONTROLS

    TEST

    BETA

    SE

    CHISQ

    LOG10P

    EXTRA

    SNP: SNP names in Chr:Pos:A0:A1

    P
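
    The column list above includes both LOG10P and P. A minimal sketch of converting one to the other and keeping genome-wide significant rows follows. It assumes that LOG10P stores -log10 of the P-value and that the file is space-delimited; neither is confirmed by the listing. The 5e-8 cut-off is the conventional GWAS threshold, not one stated by the authors, and the two rows are made-up values.

```python
import csv
import io

GENOME_WIDE_P = 5e-8  # conventional threshold (assumption, not from the dataset)


def log10p_to_p(log10p: float) -> float:
    """Convert -log10(P) back to a P-value (assumes LOG10P is -log10 P)."""
    return 10.0 ** (-log10p)


# Tiny made-up sample using a subset of the documented column names.
sample = (
    "CHROM GENPOS ID ALLELE0 ALLELE1 A1FREQ BETA SE LOG10P\n"
    "1 1000 rs_example_1 A G 0.12 0.8 0.1 9.0\n"
    "2 2000 rs_example_2 C T 0.30 0.1 0.2 1.5\n"
)

hits = []
for row in csv.DictReader(io.StringIO(sample), delimiter=" "):
    p = log10p_to_p(float(row["LOG10P"]))
    if p < GENOME_WIDE_P:
        hits.append((row["ID"], p))

print(hits)
```

    For the real file, `io.StringIO(sample)` would be replaced by an opened (possibly gzip-compressed) file handle.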

  12. Mapping the phenotype model.

    • plos.figshare.com
    xls
    Updated Jun 6, 2023
    Cite
    Nada AlMohaisen; Matthew Gittins; Chris Todd; Sorrel Burden (2023). Mapping the phenotype model. [Dataset]. http://doi.org/10.1371/journal.pone.0278371.t003
    Available download formats
    xls
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Nada AlMohaisen; Matthew Gittins; Chris Todd; Sorrel Burden
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mapping the phenotype model.

  13. SNP and SNP-set results for low-density lipoprotein (LDL) cholesterol in ten...

    • plos.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Pinar Demetci; Wei Cheng; Gregory Darnell; Xiang Zhou; Sohini Ramachandran; Lorin Crawford (2023). SNP and SNP-set results for low-density lipoprotein (LDL) cholesterol in ten thousand randomly sampled individuals of European ancestry from the UK Biobank. [Dataset]. http://doi.org/10.1371/journal.pgen.1009754.s055
    Available download formats
    zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Pinar Demetci; Wei Cheng; Gregory Darnell; Xiang Zhou; Sohini Ramachandran; Lorin Crawford
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We analyze the same J = 394,174 SNPs and G = 18,364 SNP-sets used in the Framingham Heart Study analyses. Here, SNP-set annotations are based on gene boundaries defined by the NCBI’s RefSeq database in the UCSC Genome Browser [50]. Unannotated SNPs located within the same genomic region were labeled as being within the “intergenic region” between two genes. This file gives the posterior inclusion probabilities (PIPs) for the input and hidden layer neural network weights after fitting the BANNs model on the individual-level data. We assess significance for both SNPs and SNP-sets according to the “median probability model” threshold [57]. Page #1 provides the variant-level association mapping results with columns corresponding to: (1) chromosome; (2) SNP ID; (3) chromosomal position in base-pair (bp) coordinates; (4) SNP PIP; and (5) SuSiE PIP, which corresponds to SNP-level posterior inclusion probabilities computed by SuSiE [46]. Page #2 provides the SNP-set level enrichment results with columns corresponding to: (1) chromosome; (2) SNP-set ID; (3-4) the starting and ending position of the SNP-set chromosomal boundaries; (5) SNP-set PIP; (6) RSS PIP, which corresponds to the posterior inclusion probabilities computed by RSS [26]; (7) the number of SNPs that have been annotated within each SNP-set; (8) the “top” associated SNP within each SNP-set; (9) the PIP of each top SNP. Pages #3 and #4 provide similar results based on analyses where each SNP-set annotation has been augmented with a ±500 kilobase (kb) buffer to account for possible regulatory elements. (ZIP)
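
    The “median probability model” rule mentioned above is usually taken to mean selecting every variable whose posterior inclusion probability is at least 0.5. A small sketch of applying that rule to rows shaped like Page #1 follows. The 0.5 value comes from the standard definition rather than from this listing, and the rows are made-up placeholders, not values from the file.

```python
MEDIAN_PROBABILITY_THRESHOLD = 0.5  # standard definition; assumed, not quoted from the dataset

# Hypothetical rows in the Page #1 layout: (chromosome, SNP ID, position in bp, SNP PIP).
rows = [
    ("1", "rs_example_a", 1_000_000, 0.91),
    ("1", "rs_example_b", 1_000_500, 0.12),
    ("19", "rs_example_c", 45_000_000, 0.50),
]


def significant_snps(table):
    """Return SNP IDs whose PIP meets the median probability model threshold."""
    return [snp_id for _, snp_id, _, pip in table if pip >= MEDIAN_PROBABILITY_THRESHOLD]


selected = significant_snps(rows)
print(selected)
```

    The same function would apply to the SNP-set PIP column of Page #2 after adjusting the tuple layout.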

  14. Biobanking Market Analysis, Size, and Forecast 2024-2028: North America (US...

    • technavio.com
    Cite
    Technavio, Biobanking Market Analysis, Size, and Forecast 2024-2028: North America (US and Canada), Europe (France, Germany, Italy, and UK), Middle East and Africa (Egypt, KSA, Oman, and UAE), APAC (China, India, and Japan), South America (Argentina and Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/biobanking-market-industry-analysis
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Saudi Arabia, United States, Canada, Germany, Global
    Description

    Biobanking Market Size 2024-2028

    The biobanking market size is forecast to increase by USD 1.67 billion, at a CAGR of 9.04% between 2023 and 2028.

    The market is experiencing significant growth, driven by the increasing demand for regenerative medicine. This trend is fueled by advancements in genetic research and the potential for customized treatment plans based on individual genetic profiles. Another key driver is the emergence of stem cell storage in biobanks and biopreservation, offering new opportunities for medical research and therapeutic applications. However, this market also faces challenges. Ethical issues surrounding the collection, storage, and use of biological samples remain a significant obstacle. Ensuring informed consent, privacy protection, and adherence to regulatory guidelines are essential for maintaining public trust and avoiding potential legal disputes.
    Companies seeking to capitalize on market opportunities must navigate these challenges effectively, while also staying abreast of technological advancements and evolving customer needs. Success in the market requires a strong commitment to ethical practices, innovative solutions, and strategic partnerships.
    

    What will be the Size of the Biobanking Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2018-2022 and forecasts 2024-2028 - in the full report.

    The market continues to evolve, driven by advancements in data management, sample collection, and research applications. Biobanks are increasingly integrating LIMS systems for efficient sample accessibility and inventory management. Forensic samples and microbial samples join the ranks of clinical and research specimens in biobanking, expanding its scope. Data analytics plays a crucial role in drug discovery and precision medicine, necessitating robust data security and access control. Ethical considerations, informed consent, and biobanking ethics remain paramount, shaping the industry's growth. Cell lines and audit trails are essential components of biobanking, ensuring transparency and traceability. Biobanking software facilitates sample availability and public health research, while temperature monitoring, humidity control, and predictive modeling optimize sample storage and processing.

    Biobank networks collaborate to share resources and expertise, fostering advancements in therapeutic development, biomarker discovery, and disease research. Intellectual property rights and metadata standards ensure data integrity and enable data sharing. Short-term and long-term storage solutions, including dry ice, liquid nitrogen, and cryogenic freezers, cater to various sample preservation requirements. Automated liquid handling and temperature monitoring systems streamline sample processing and enhance quality control. Biobanking's continuous dynamism is reflected in its applications across sectors, from clinical trials to public health, and its role in advancing research and therapeutic development.

    How is this Biobanking Industry segmented?

    The biobanking industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Type
      Physical
      Virtual
    Product
      Equipment
      Consumables
    End-User
      Pharmaceutical & Biotechnology Companies
      Academic & Research Institutions
      Hospitals
      Contract Research Organizations (CROs)
    Application
      Regenerative Medicine
      Life Science Research
      Clinical Research
      Drug Discovery & Development
      Personalized Medicine
    Sample Type
      Blood Products
      Human Tissues
      Cell Lines
      Nucleic Acids
      Biological Fluids
      Human Waste Products
    Biobank Type
      Population-Based Biobanks
      Disease-Based Biobanks
      Virtual Biobanks
      Tissue Biobanks
      Genetic Biobanks
    Geography
      North America
        US
        Canada
      Europe
        France
        Germany
        Italy
        UK
      Middle East and Africa
        Egypt
        KSA
        Oman
        UAE
      APAC
        China
        India
        Japan
      South America
        Argentina
        Brazil
      Rest of World (ROW)

    By Type Insights

    The physical segment is estimated to witness significant growth during the forecast period.

    Biobanks, as repositories for biological samples including human tissues, cells, blood, DNA, and other biomolecules, play a crucial role in research and medical applications. The physical segment of the market encompasses various types of biobanks, categorized by the nature of the samples. These include tissue biobanks, cell biobanks, and blood biobanks. The increasing emphasis on personalized medicine, which customizes treatments based on individual patients' genetic makeup and biomarkers, drives the demand for high-quality biological samples. Data management is

  15. GeneATLAS

    • rrid.site
    Updated Jan 29, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2022). GeneATLAS [Dataset]. http://identifiers.org/RRID:SCR_017577
    Dataset updated
    Jan 29, 2022
    Description

    Database of associations between traits and variants using the UK Biobank cohort. Searchable atlas of genetic associations. Assists researchers in querying UK Biobank. Provides an unbiased view of phenotype and genotype associations across traits.

  16. Genome-wide association summary statistics for back pain

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Yakov A Tsepilov (2020). Genome-wide association summary statistics for back pain [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1319331
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Frances MK Williams
    Melody Palmer
    Yurii S Aulchenko
    Yakov A Tsepilov
    CHARGE Musculoskeletal Working Group
    Maxim B Freidin
    Lennart Karssen
    Pradeep Suri
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains results of a genome-wide association study of back pain. Two files contain association summary statistics: one for the discovery GWAS, based on the analysis of 350,000 white British individuals from the UK Biobank, and one for the meta-analysis GWAS, based on the meta-analysis of the same 350,000 individuals and an additional 103,862 individuals of European ancestry from the UK Biobank (total N = 453,862). The phenotype of back pain was defined by the answer provided by the UK Biobank participants to the following question: "Pain type(s) experienced in last month". Those who reported “Back pain” were considered cases; all the rest were considered controls. Individuals who did not reply or replied "Prefer not to answer" or "Pain all over the body" were excluded. This dataset is also available for graphical exploration in the genomic context at http://gwasarchive.org.

    The data are provided on an "AS-IS" basis, without warranty of any type, expressed or implied, including but not limited to any warranty as to their performance, merchantability, or fitness for any particular purpose. If investigators use these data, any and all consequences are entirely their responsibility. By downloading and using these data, you agree that you will cite the appropriate publication in any communications or publications arising directly or indirectly from these data; for utilisation of data available prior to publication, you agree to respect the requested responsibilities of resource users under 2003 Fort Lauderdale principles; you agree that you will never attempt to identify any participant. This research has been conducted using the UK Biobank Resource and the use of the data is guided by the principles formulated by the UK Biobank.

    When using downloaded data, please cite corresponding paper and this repository:

    Insight into the genetic architecture of back pain and its risk factors from a study of 509,000 individuals. Freidin, Maxim; Tsepilov, Yakov; Palmer, Melody; Karssen, Lennart; Suri, Pradeep; Aulchenko, Yurii; Williams, Frances MK; CHARGE Musculoskeletal Working Group. PAIN, February 06, 2019, Articles in Press. doi: 10.1097/j.pain.0000000000001514

    Maxim B Freidin, Yakov A Tsepilov, Melody Palmer, Lennart Karssen, CHARGE Musculoskeletal Working Group, Pradeep Suri, … Frances MK Williams. (2018). Genome-wide association summary statistics for back pain (Version 1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1319332

    Funding:

    This study was supported by the European Community’s Seventh Framework Programme funded project PainOmics (Grant agreement # 602736). The research has been conducted using the UK Biobank Resource (project # 18219).

    The development of software implementing SMR/HEIDI test and database for GWAS results was supported by the Russian Ministry of Science and Education under the 5-100 Excellence Program.

    Dr. Suri’s time for this work was supported by VA Career Development Award # 1IK2RX001515 from the United States (U.S.) Department of Veterans Affairs Rehabilitation Research and Development Service. The contents of this work do not represent the views of the U.S. Department of Veterans Affairs or the United States Government.

    Dr. Tsepilov’s time for this work was supported in part by the Russian Ministry of Science and Education under the 5-100 Excellence Program.

    Column headers - discovery (350K)

    CHR: chromosome

    POS: position (GRCh37 build)

    ID: SNP rsID

    REF: reference allele (coded as "0")

    ALT: effect allele (coded as "1")

    CASE_ALLELE_CT: allele observation count in cases

    CTRL_ALLELE_CT: allele observation count in controls

    ALT_FREQ: effect allele frequency

    MACH_R2: imputation quality

    TEST: model of association test (additive)

    OBS_CT: sample size

    BETA: effect size of effect allele

    SE: standard error of effect size

    T_STAT: Z-value of effect allele

    P: P-value of association (without GC correction)

    MAF: minor allele frequency

    Column headers - meta-analysis (450K)

    MarkerName: SNP rsID

    Allele1: effect allele (coded as "1")

    Allele2: reference allele (coded as "0")

    Freq1: effect allele frequency

    FreqSE: standard error of effect allele frequency

    Effect: effect size of effect allele

    StdErr: standard error of effect size

    P-value: P-value of association (without GC correction)

    Direction: sign of effect in discovery and replication samples

    n_total: Total sample size

    CHR: chromosome

    POS: position (GRCh37 build)

    MACH_R2_discovery: imputation quality in discovery sample
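
    The Effect, StdErr and P-value columns listed above are related through a Z-statistic. The sketch below shows that relationship under the assumption of a two-sided test against a standard normal distribution; the listing does not state which test produced the published P-values, so recomputed values may differ slightly. The numeric inputs are made-up examples.

```python
import math


def z_score(effect: float, std_err: float) -> float:
    """Z-statistic from an effect size and its standard error."""
    if std_err <= 0:
        raise ValueError("standard error must be positive")
    return effect / std_err


def two_sided_p(z: float) -> float:
    """Two-sided P-value under a standard normal: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Made-up example row: Effect = 0.02, StdErr = 0.005.
z = z_score(0.02, 0.005)
p = two_sided_p(z)
print(f"Z = {z:.3f}, P = {p:.3e}")
```

    Comparing a recomputed P against the reported P-value column is a quick sanity check that the effect allele and standard error columns were read correctly.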

  17. INTERVAL

    • healthdatagateway.org
    unknown
    Cite
    INTERVAL must be acknowledged in all publications using these data. Further details will be issued through the Data Access Committee., INTERVAL [Dataset]. https://healthdatagateway.org/dataset/201
    Available download formats
    unknown
    Dataset authored and provided by
    INTERVAL must be acknowledged in all publications using these data. Further details will be issued through the Data Access Committee.
    License

    http://www.donorhealth-btru.nihr.ac.uk/wp-content/uploads/2020/04/Data-Access-Policy-v1.0-14Apr2020.pdf

    Description

    In over 100 years of blood donation practice, INTERVAL is the first randomised controlled trial to assess the impact of varying the frequency of blood donation on donor health and the blood supply. It provided policy-makers with evidence that collecting blood more frequently than current intervals can be implemented over two years without impacting on donor health, allowing better management of the supply to the NHS of units of blood with in-demand blood groups. INTERVAL was designed to deliver a multi-purpose strategy: an initial purpose related to blood donation research aiming to improve NHS Blood and Transplant’s core services and a longer-term purpose related to the creation of a comprehensive resource that will enable detailed studies of health-related questions.

    Approximately 50,000 generally healthy blood donors were recruited between June 2012 and June 2014 from 25 NHS Blood Donation centres across England. Approximately equal numbers of men and women; aged from 18-80; ~93% white ancestry. All participants completed brief online questionnaires at baseline and gave blood samples for research purposes. Participants were randomised to giving blood every 8/10/12 weeks (for men) and 12/14/16 weeks (for women) over a 2-year period. ~30,000 participants returned after 2 years and completed a brief online questionnaire and gave further blood samples for research purposes.

    The baseline questionnaire includes brief lifestyle information (smoking, alcohol consumption, etc), iron-related questions (e.g., red meat consumption), self-reported height and weight, etc. The SF-36 questionnaire was completed online at baseline and 2-years, with a 6-monthly SF-12 questionnaire between baseline and 2-years.

    All participants have had the Affymetrix Axiom UK Biobank genotyping array assayed and then imputed to 1000G+UK10K combined reference panel (80M variants in total). 4,000 participants have 50X whole-exome sequencing and 12,000 participants have 15X whole-genome sequencing. Whole-blood RNA sequencing has commenced in ~5,000 participants.

    The dataset also contains data on clinical chemistry biomarkers, blood cell traits, >200 lipoproteins, metabolomics (Metabolon HD4), lipidomics, and proteomics (SomaLogic, Olink), either cohort-wide or in large sub-sets of the cohort.

  18. European LD files for GhostKnockoffGWAS

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Feb 20, 2024
    Cite
    Benjamin B Chu (2024). European LD files for GhostKnockoffGWAS [Dataset]. http://doi.org/10.5281/zenodo.10433663
    Available download formats
    zip
    Dataset updated
    Feb 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Benjamin B Chu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 2024
    Description

    This contains pre-processed LD files (Sigma matrix, S matrix, ...etc) computed on the EUR cohort of Pan-UKB LD data. It is intended to be used as an input to the GhostKnockoffGWAS pipeline.

    • We restricted our attention to the EUR panel
    • We filtered the original HailBlockMatrix LD panel to genotypes that are typed (i.e. imputed SNPs were removed)
    • Coordinates in both hg19 and hg38 are available. Conversion from hg19 to hg38 was achieved with the R package liftOver.
    • Downloading and processing of the original HailBlockMatrix formatted data is accomplished by the EasyLD.jl software: https://biona001.github.io/EasyLD.jl
    • Knockoff optimization was carried out with the Knockoffs.jl Julia package: https://github.com/biona001/Knockoffs.jl
    • The results (i.e. the files available on this site) are saved as .csv and .h5 files for easier access, and are directly readable by GhostKnockoffGWAS.

  19. Linkage-Disequilibrium (LD) matrices for six continental ancestry groups...

    • zenodo.org
    application/gzip
    Updated Jan 8, 2025
    Cite
    Shadi Zabad (2025). Linkage-Disequilibrium (LD) matrices for six continental ancestry groups from the UK Biobank [Dataset]. http://doi.org/10.5281/zenodo.14614207
    Available download formats
    application/gzip
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shadi Zabad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains Linkage Disequilibrium (LD) matrices for six ancestry groups from the UK Biobank.

    LD matrices record the SNP-by-SNP correlations in a given sample of individuals from the general population. In this case, we threshold the matrices so that we only record the correlations between variants in the same LD block (defined by LDetect). The continental ancestry groups are defined by the Pan-UKB initiative as:

    • EUR = European ancestry (N=362446)
    • CSA = Central/South Asian ancestry (N=8284)
    • AFR = African ancestry (N=6255)
    • EAS = East Asian ancestry (N=2700)
    • MID = Middle Eastern ancestry (N=1567)
    • AMR = Admixed American ancestry (N=987)

    The sample sizes here are restricted to unrelated individuals in the UK Biobank. The matrices were computed using magenpy and quantized to int8 data type for better compressibility. The standard matrices (EUR.tar.gz, AFR.tar.gz, ...) contain pairwise correlations for 1.4 million HapMap3+ variants. For European samples, we also provide LD matrices that record pairwise correlations for up to 18 million variants (EUR_18m_variants.tar.gz)
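
    To illustrate what quantizing correlations to int8 involves, here is a minimal sketch using a simple symmetric scale of 127. This is an assumption for illustration only: the actual encoding used by magenpy is not described in this listing and may differ, and the function names below are hypothetical.

```python
from array import array

SCALE = 127  # assumed symmetric scale: maps [-1, 1] onto [-127, 127]


def quantize(r: float) -> int:
    """Map a correlation in [-1, 1] to a signed 8-bit integer."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return round(r * SCALE)


def dequantize(q: int) -> float:
    """Recover an approximate correlation from its int8 code."""
    return q / SCALE


# Made-up correlation values.
correlations = [1.0, -0.5, 0.1234, 0.0]

# array type code "b" stores signed chars, one byte per entry.
packed = array("b", (quantize(r) for r in correlations))
restored = [dequantize(q) for q in packed]

max_error = max(abs(a - b) for a, b in zip(correlations, restored))
print(packed.itemsize, max_error)
```

    Under this scheme each entry takes one byte instead of eight for a double, at the cost of a rounding error of at most 0.5/127 per correlation.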

    For more details on how these matrices were computed, please consult our manuscript:

    Towards whole-genome inference of polygenic scores with fast and memory-efficient algorithms
    Shadi Zabad, Chirayu Anant Haryan, Simon Gravel, Sanchit Misra, Yue Li

    To access these matrices, consult the codebase of magenpy, our custom python package with special data structures for processing these LD matrices.

  20. Variable mapped to malnutrition, frailty and sarcopenia.

    • plos.figshare.com
    xls
    Updated Jun 6, 2023
    Cite
    Nada AlMohaisen; Matthew Gittins; Chris Todd; Sorrel Burden (2023). Variable mapped to malnutrition, frailty and sarcopenia. [Dataset]. http://doi.org/10.1371/journal.pone.0278371.t002
    Available download formats
    xls
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Nada AlMohaisen; Matthew Gittins; Chris Todd; Sorrel Burden
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Variable mapped to malnutrition, frailty and sarcopenia.
