Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The identification of pathogenically-relevant genes and tissues for complex traits can be a difficult task. We developed an approach named genome-wide imputed differential expression enrichment (GIDEE), to prioritise trait-relevant tissues by combining genome-wide association study (GWAS) summary statistic data with tissue-specific expression quantitative trait loci (eQTL) data from 49 GTEx tissues. Our GIDEE approach analyses robustly imputed gene expression and tests for enrichment of differentially expressed genes in each tissue. Two tests (mean squared z-score and empirical Brown’s method) utilise the full distribution of differential expression p-values across all genes, while two binomial tests assess the proportion of genes with tissue-wide significant differential expression. GIDEE was applied to nine training datasets with known trait-relevant tissues and ranked 49 GTEx tissues using the individual and combined enrichment tests. The best-performing enrichment test produced an average rank of 1.55 out of 49 for the known trait-relevant tissue across the nine training datasets—ranking the correct tissue first five times, second three times, and third once. Subsequent application of the GIDEE approach to 20 test datasets—whose pathogenic tissues or cell types are uncertain or unknown—provided important prioritisation of tissues relevant to the trait’s regulatory architecture. GIDEE prioritisation may thus help identify both pathogenic tissues and suitable proxy tissue/cell models (e.g., using enriched tissues/cells that are more easily accessible). The application of our GIDEE approach to GWAS datasets will facilitate follow-up in silico and in vitro research to determine the functional consequence(s) of their risk loci.
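The two families of enrichment tests described above can be illustrated with a minimal sketch. It is not the GIDEE implementation: it assumes genes are independent, converts two-sided p-values to z-scores, and uses a hypothetical per-gene significance threshold `alpha` as the null proportion in the binomial test. All function names are illustrative.

```python
import math
from statistics import NormalDist


def mean_squared_z(p_values):
    """Mean of squared z-scores derived from two-sided p-values (each in (0, 1])."""
    nd = NormalDist()
    # A two-sided p-value p corresponds to |z| = -inv_cdf(p / 2).
    z_scores = [-nd.inv_cdf(p / 2) for p in p_values]
    return sum(z * z for z in z_scores) / len(z_scores)


def binomial_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0), computed by direct summation."""
    return sum(
        math.comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1)
    )


def binomial_enrichment(p_values, alpha=0.05):
    """Count genes with p < alpha and test whether that count exceeds chance.

    alpha is a hypothetical threshold; the source uses a tissue-wide
    significance level that is not specified here.
    """
    n = len(p_values)
    k = sum(p < alpha for p in p_values)
    return k, binomial_upper_tail(k, n, alpha)
```

Under the null hypothesis the mean squared z-score is expected to be close to 1, so larger values suggest an excess of differentially expressed genes in that tissue. Tissues could then be ranked by either statistic.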
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Summary statistics data generated in Ferrari et al, 2014, Lancet Neurol (PMID: 24943344).
The International FTD-Genetics Consortium (IFGC) shares the summary results data to allow other researchers to explore variants and/or loci for hypothesis-driven work. The data provides information on ~6M markers and includes: marker, trait, allele 1 and 2, OR or beta, standard error, p-value, chromosome and bp position.
To prevent identification of individuals, allele frequency data are not released.
The data consists of the summary statistics generated during the discovery phase (phase I) of the study, including: bvFTD (n=1377 vs 2754 ctrls) AND/OR SD (n=308 vs 616 ctrls) AND/OR PNFA (n=269 vs 538 ctrls) AND/OR FTD-MND (n=200 vs 400 ctrls) AND/OR subtypes meta-analysis.
Note:
1. The IFGC requests to be included among co-authors in publications that might result from the use of this data as “The International FTD-Genetics Consortium (IFGC)”, following PubMed guidelines under which consortia or working group authors are listed on PubMed as collaborators rather than authors, and collaborator names are searchable on PubMed in the same way as authors. The acknowledgments associated with the IFGC as well as the IFGC members are provided as a separate PDF document, together with the summary statistics.
2. Publications (including but not limited to manuscripts, presentations, patents, grants) based on this IFGC dataset shall include the citation of the original work (Ferrari et al, 2014, Lancet Neurol, PMID: 24943344) and add the following to the acknowledgement section: “We thank the International FTD-Genetics Consortium (IFGC) for summary data”.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary Material
https://www.skyquestt.com/privacy/
The U.S. genetic testing market was valued at USD 5.27 billion in 2022 and is projected to grow from USD 5.6 billion in 2023 to USD 10.45 billion by 2031, at a CAGR of 7.9% over the forecast period (2024-2031).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Studying the impact of genetic variation on gene regulatory networks is essential to understand the biological mechanisms by which genetic variation causes variation in phenotypes. Bayesian networks provide an elegant statistical approach for multi-trait genetic mapping and modelling causal trait relationships. However, inferring Bayesian gene networks from high-dimensional genetics and genomics data is challenging, because the number of possible networks scales super-exponentially with the number of nodes, and the computational cost of conventional Bayesian network inference methods quickly becomes prohibitive. We propose an alternative method to infer high-quality Bayesian gene networks that easily scales to thousands of genes. Our method first reconstructs a node ordering by conducting pairwise causal inference tests between genes; this ordering then allows a Bayesian network to be inferred via a series of independent variable selection problems, one for each gene. We demonstrate using simulated and real systems genetics data that this results in a Bayesian network with equal, and sometimes better, likelihood than the conventional methods, while having a significantly higher overlap with ground-truth networks and being orders of magnitude faster. Moreover, our method allows for a unified false discovery rate control across genes and individual edges, and thus a rigorous and easily interpretable way of tuning the sparsity level of the inferred network. Bayesian network inference using pairwise node ordering is a highly efficient approach for reconstructing gene regulatory networks when prior information for the inclusion of edges exists or can be inferred from the available data.
https://www.skyquestt.com/privacy/
The global genetic testing market was valued at USD 6.08 billion in 2022 and is projected to grow from USD 7.42 billion in 2023 to USD 36.40 billion by 2031, at a CAGR of 22% over the forecast period (2024-2031).
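As a rough consistency check on the figures above, the compound annual growth rate (CAGR) can be recomputed from the start and end values. This is a minimal sketch that assumes annual compounding over the eight years from 2023 to 2031; the function names are illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding start_value at the given annual rate."""
    return start_value * (1 + rate) ** years


# Figures quoted in the listing: USD 7.42 billion (2023) to USD 36.40 billion (2031).
global_rate = cagr(7.42, 36.40, 2031 - 2023)
print(f"Implied CAGR: {global_rate:.1%}")
```

With these inputs the implied rate is close to the quoted 22%.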
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset holds all non-GEO-hosted supplemental data files for manuscript "Beyond the reference: gene expression variation and transcriptional response to RNAi in C. elegans". Please see the linked preprint/publication for full details.
The PDF _guide_to_datafiles.pdf gives details on the format and content of each of the included files.
https://www.ontario.ca/page/open-government-licence-ontario
This spatial data identifies breeding zones used by forest managers and forest genetic associations to manage provincial forest genetic assets.
The data shows the boundaries of breeding zones and identifies the primary or target species within each zone.
Species are associated with certain breeding programs, seed orchards and progeny (descendant) testing installations.
Additional Documentation
Forest Genetics Zone - Data Description (PDF)
Forest Genetics Zone - Documentation (Word)
Status
Completed: production of the data has been completed
Maintenance and Update Frequency
As needed: data is updated as deemed necessary
Contact
Christine Kent, Integration Branch, christine.kent@ontario.ca
https://www.technavio.com/content/privacy-notice
Next Generation Sequencing Data Analysis Market Size 2024-2028
The global next generation sequencing data analysis market size is estimated to grow by USD 1.90 billion at a CAGR of 22.58% between 2023 and 2028. The market's growth hinges on several factors, including the escalating demand for personalized medicine, the increasing need for early diagnosis of genetic disorders, and the expanding applications in genomics research. Personalized medicine, tailored to individual genetic makeup, is gaining traction for its targeted and more effective treatment approach. The emphasis on early diagnosis of genetic disorders is driving the demand for advanced genetic testing technologies. Moreover, the broadening applications in genomics research, particularly in understanding genetic mechanisms and disease pathways, are fueling market expansion. These trends collectively highlight the growing significance of genetic testing and personalized medicine in healthcare, underscoring the market's growth trajectory.
What will be the Size of the Next Generation Sequencing Data Analysis Market During the Forecast Period?
Key Companies & Market Insights
Companies are implementing various strategies, such as strategic alliances, partnerships, mergers and acquisitions, geographical expansion, and product/service launches, to enhance their presence in the market. The report also includes detailed analyses of the competitive landscape of the market and information about key companies, including:
Agilent Technologies Inc., Alphabet Inc., BGI Genomics Co. Ltd., Bio Rad Laboratories Inc., Bionivid Technology Pvt. Ltd., Congenica Ltd., Corewell Health, DNAnexus Inc., DNASTAR Inc., Eurofins Scientific SE, F. Hoffmann La Roche Ltd., Fabric Genomics Inc., Golden Helix Inc., HiberCell Inc., Illumina Inc., Invitae Corp., Macrogen Inc., Oxford Nanopore Technologies plc, Pacific Biosciences of California Inc., Partek Inc., PierianDx Inc., QIAGEN NV, SciGenom Labs Pvt. Ltd., Takara Bio Inc., Thermo Fisher Scientific Inc., and Vela Diagnostics
Qualitative and quantitative analysis of companies has been conducted to help clients understand the wider business environment as well as the strengths and weaknesses of key market players. Data is qualitatively analyzed to categorize companies as pure play, category-focused, industry-focused, and diversified; it is quantitatively analyzed to categorize companies as dominant, leading, strong, tentative, and weak.
Market Segmentation
By End-user
The market share growth by the academic research segment will be significant during the forecast period. The market encompasses DNA sequencing technologies used in genomic science, academic research, and clinical diagnostics. Academic institutions utilize NGS for various applications, such as drug discovery, personalized medicine, and clinical diagnostics.
The academic research segment was valued at USD 221.1 million in 2018. Key drivers include decreasing sequencing costs, user-friendly software, and the demand for precision medicine. NGS enables the analysis of genomic patterns, epigenetics, and biological processes through sequence analysis tools and algorithms. Applications include oncology, genetic research, and tumor genotyping. NGS protocols aid in identifying somatic driver mutations, germline mutations, and resistance mutations. Cancer-related illnesses, financial irregularities, and healthcare professionals benefit from these tools, machine learning techniques, and cloud-based solutions. Additionally, NGS is applied in agriculture, forensics, and genomic studies. Key technologies include Whole-Genome Sequencing, array-based technologies, and clinical. Hence, these factors are expected to drive the market during the forecast period.
By Product
Services play an important role in the market, providing specialized expertise and support to users in analyzing and interpreting their NGS data. The market encompasses various services for Exome Sequencing, Targeted Resequencing, De Novo Sequencing, and Methyl Sequencing. Biotechnology and pharmaceutical companies, along with contract research organizations, utilize these services to analyze and interpret their NGS data. The process involves raw data preprocessing, alignment, variant calling, and annotation, employing advanced tools and algorithms. Service providers ensure accuracy and reliability through quality control measures and optimization of parameters. Technologies such as sequencing by synthesis (SBS) are an integral part of these services. Hence, these factors are expected to drive the growth of the services segment in the market during the forecast period.
Regional Analysis
North America is estimated to contribute 49% to the growth of the global mark
https://lio.maps.arcgis.com/sharing/rest/content/items/badb097e306b4d3b8becb3dba3ee5807/data
This spatial data identifies the locations of areas designated for the study, conservation and improvement of tree species. It includes the location, name and established date for sites. It also identifies the primary or target tree species associated with the site. In combination with other silvicultural activities such as site preparation and fertilization, these areas support the overall yield, quality and sustainability of products from forest lands. They also contribute to the study and management of adaptive variation and genetic diversity.
Additional Documentation
Forest Genetics Site - Data Description (PDF)
Forest Genetics Site - Documentation (Word)
Status
Ongoing: data is being continually updated
Maintenance and Update Frequency
As needed: data is updated as deemed necessary
Contact
Christine Kent, Integration Branch, christine.kent@ontario.ca
This dataset has features NDMNRF classifies as sensitive. Sensitive features are subject to licensing and approvals. Approval may be requested by contacting christine.kent@ontario.ca.
The data referenced here is licensed Electronic Intellectual Property of the Ontario Ministry of Natural Resources and Forestry and is provided for professional, non-commercial use only.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fileset contents:
1. Exploring-Markets-of-Data-for-Personal-Health-Information-Questions.pdf – PDF document with survey questions
2. Exploring-Markets-of-Data-for-Personal-Health-Information-Responses.csv – Comma-delimited file with all response data
3. Exploring-Markets-of-Data-for-Personal-Health-Information-Responses-Data-Map.csv – Comma-delimited file to map response data codes
4. Exploring-Markets-of-Data-for-Personal-Health-Information-Analysis.xls – Microsoft Excel document with analysis data
5. Exploring-Markets-of-Data-for-Personal-Health-Information-Figures.pdf – PDF document with all figures
6. Exploring-Markets-of-Data-for-Personal-Health-Information-Paper.pdf – PDF document of research paper
Data availability: This dataset can only be shared within Sweden due to legal restrictions.
Background: The genomic and transcriptomic landscape of widely invasive follicular thyroid carcinomas (wiFTCs) is poorly characterized, and a large subset of these tumours lack information on credible genetic driver events. The aim of this study was to bridge this gap.
Methods: We performed whole-genome and RNA sequencing and subsequent bioinformatic analyses of 13 wiFTCs with a particularly poor prognosis, and matched normal tissue.
Results: Ten out of thirteen (77%) tumours exhibited one or several mutations in established genes ranked as the top 20 mutated in thyroid cancer, including TERT (n=4), NRAS (n=3), HRAS, KRAS, AKT, PTEN, PIK3CA, MUTYH and MEN1 (n=1 each). Recurrent somatic mutations in three genes were annotated as significant according to MutSig2CV: FAM72D (n=3), TP53 (n=3) and EIF1AX (n=3), with DGCR8 (n=2) as borderline significant. Of interest, both DGCR8 mutations were recurrent p.E518K missense alterations, a mutation known to cause familial multinodular goiter (MNG) via disruption of microRNA (miRNA) processing. Expression analyses pinpointed a trend towards reduced DGCR8 mRNA expression in FTCs in general. Copy number analyses revealed recurrent gains of loci on chromosomes 4, 6 and 10, and fusion gene analyses revealed 27 high-quality events. Based on the transcriptome data, FTCs clustered in two principal clusters, displaying significant differences in expression of genes associated with metabolic pathways.
Conclusion: In summary, we describe the genomic and transcriptomic landscape in wiFTCs, identify novel recurrent mutations and copy number alterations with possible driver properties, and lay the foundation for future studies.
The dataset consists of tables and lists containing underlying data, and supplementary figures for a manuscript submitted to "Journal of Clinical Endocrinology & Metabolism". It includes 8 tables and 3 figures:
File name: T1_Detailed-characteristics-of-the-study-cohort.csv Contains "Table 1: Detailed characteristics of the study cohort."
File name: T2_List-of-Somatic-SNVs.csv Contains "Table 2: List of Somatic SNV's (Small nucleotide variants)."
File name: T3_MutSig2CV-input-genes.csv Contains "Table 3: MutSig2CV input genes."
File name: T4_MutSig2CV-genes-ranked-by-p-value.csv Contains "Table 4: MutSig2CV genes ranked by p-value."
File name: T5_Genes-in-copy-number-altered-minimal-region-of-amplification.csv Contains "Table 5: List of genes in copy number altered minimal region of amplification."
File name: T6_Aberrant-cell-fraction-and-ploidy-as-determined-by-ASCAT.csv Contains "Table 6: Aberrant cell fraction and ploidy as determined by ASCAT."
File name: T7_High-confidence-structural-variations-in-the-tumor-cohort.csv Contains "Table 7: List of high-confidence structural variations in the tumor cohort."
File name: T8_Significant-differentially-expressed-genes-in-tumor-vs-normal-thyroid.csv Contains "Table 8: List of significant differentially expressed genes in tumor versus normal thyroid."
File name: List_of_variables.pdf Contains List of variables: Metadata and abbreviation explanations for Table 1-8.
File name: Whole-genome-sequencing-follicular-thyroid-carcinomas_Figures.pdf Contains Supplementary Figure S1-S3: - Supplementary Figure S1: Somatic mutational overview in the WGS cohort. - Supplementary Figure S2: Normalized DGCR8 mRNA expression in tumours with or without loss of heterozygosity (LOH) of the DGCR8 locus. - Supplementary Figure S3: a Gene set enrichment analysis (GSEA).
Environmental variables used in GEA tests for loci in Sweltsa with high genetic differentiation (FST). Land cover for 30 m pixels was collected from the National Land Cover Database 2016 (https://www.mrlc.gov/data/nlcd-2016-land-cover-conus). Annual, spring (SpPrecip) and summer (SumPrecip) precipitation data come from Daymet (Thornton et al. 1997, 2012). Aspect (northness), slope and elevation data come from NED 10 m DEM (http://seamless.usgs.gov). Net primary productivity (NPP) data come from https://www.ntsg.umt.edu/files/modis/MOD17UsersGuide2015_v3.pdf. Day to last snow cover (DLS) was calculated after determining the first snow-free 8-day period (Modis 2017). Mean modeled June (STJune) and July (STJuly) stream temperature data were from mean monthly stream temperature predictions for the baseline period 1986-2005 (see Jones et al. 2014, 100 m pixels). Daily minimum air temperatures (DMT) were collected from Daymet (Thornton et al. 1997, 2012) and mean degree days (DegDays) for the pe...
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for PDF (Mus musculus) curated by BioGRID (https://thebiogrid.org); DEFINITION: peptide deformylase (mitochondrial)
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Levels of sociability are continuously distributed in the general population, and decreased sociability represents an early manifestation of several brain disorders. Here, we investigated the genetic underpinnings of sociability in the population.
Main questions of our research:
1. Are there common genetic variants that are associated with sociability in the general population?
2. Are genetic variants that are associated with sociability also associated with neuropsychiatric disorders?
Type of data uploaded in this repository:
The UK Biobank project (see https://www.ukbiobank.ac.uk/) is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. The database is globally accessible to approved researchers undertaking vital research into the most common and life-threatening diseases. The raw data that this project is based on come from the publicly available UK Biobank set, which is very large and is therefore not provided here. Here we only provide the results from our analysis, which is also described here: https://www.biorxiv.org/content/10.1101/781195v2 and is currently in revision at a scientific journal. In the dataset you will find the association of 9327396 genetic variants with the phenotype sociability. This dataset is not suitable for opening in Excel; it is best opened on a cluster computer or using specific software.
Subjects
The UK Biobank (UKBB) is a major population-based cohort from the United Kingdom that includes individuals aged between 37 and 73 years.
We constructed a sociability measure based on the aggregation of scores per participant on four questions from the UKBB database that link to sociability, including (1) a question about the frequency of friend/family visits, (2) a question on the number and type of social venues that are visited, (3) a question about worrying after social embarrassment and (4) a question about feeling lonely, leading to a sociability score ranging from 0 to 4. Participants were excluded if they had somatic problems that could be related to social withdrawal (BMI < 15 or BMI > 40, narcolepsy (all the time), stroke, severe tinnitus, deafness or brain-related cancers) or if they answered “No friends/family outside household”, “Do not know” or “Prefer not to answer” to any of the questions.
SNP genotyping and quality control
Details about the available genome-wide genotyping data for UKBB participants have been reported previously (PMID: 30305743). We used third-release genotyping data (see https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100319). Briefly, 49,950 participants were genotyped using the UK BiLEVE Axiom Array and 438,427 participants were genotyped using the UK Biobank Axiom Array. Genotypes were imputed into the dataset using the Haplotype Reference Consortium (HRC) and the UK10K haplotype resource. To account for ethnicity, we included only those individuals that identified themselves as "white" by self-report and plotted the Principal Components (PC) provided by the UKBB, excluding individuals considered to be outliers according to PCs 1 and 2. Genetic relatedness calculated with KING kinship and provided by the UKBB (https://kenhanscombe.github.io/ukbtools/articles/explore-ukb-data.html ; http://www.ukbiobank.ac.uk/wp-content/uploads/2014/04/UKBiobank_genotyping_QC_documentation-web.pdf) was used to identify first- and second-degree relatives. Subsequently, ‘families’ (i.e. clusters of related individuals above an IBD>0.125 threshold) were created and only one individual from each of these ‘families’ was included in the analysis. If self-reported sex and SNP-based sex differed, individuals were excluded from further analysis. Single nucleotide polymorphisms (SNPs) with minor allele frequency <0.005, Hardy-Weinberg equilibrium test P value <1e−6, missing genotype rate >0.05, or imputation quality of INFO <0.8 were excluded. In the current study, all analyses are based on 342,461 participants of European ancestry for which both genotype data and sociability scores were available.
Genome-wide association analysis
Genome-wide association analysis with the imputed marker dosages was performed in PLINK1.9, using a linear regression model with the sociability measure as the dependent variable and including sex, age, the first 10 PCs, assessment center, and genotype batch as covariates. SNPs were considered significantly associated if they had p-value < 5e-8. Associated loci were considered independent of each other at r2 0.6 and lead SNPs were classified as the SNP with the smallest association p-value and at r2 0.1, using a 250kb window.
The summary statistics come from the plink2 linear regression analysis.
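The SNP-level exclusion thresholds and the genome-wide significance cut-off described above can be expressed as a small filter. This is only a sketch of the stated criteria, not the authors' pipeline (which used PLINK); the `Snp` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    # Hypothetical per-variant record; field names are illustrative only.
    snp_id: str
    maf: float           # minor allele frequency
    hwe_p: float         # Hardy-Weinberg equilibrium test p-value
    missing_rate: float  # missing genotype rate
    info: float          # imputation quality (INFO score)
    assoc_p: float       # association p-value from the regression


def passes_qc(snp):
    """Keep a SNP unless it meets any of the exclusion criteria in the text."""
    return (
        snp.maf >= 0.005
        and snp.hwe_p >= 1e-6
        and snp.missing_rate <= 0.05
        and snp.info >= 0.8
    )


def is_genome_wide_significant(snp, threshold=5e-8):
    """Apply the p-value < 5e-8 significance rule stated in the text."""
    return snp.assoc_p < threshold


def significant_hits(snps):
    """Return IDs of SNPs that pass QC and reach genome-wide significance."""
    return [s.snp_id for s in snps if passes_qc(s) and is_genome_wide_significant(s)]
```

For example, a variant with an INFO score of 0.7 would be dropped before the significance test, regardless of its association p-value.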
This record includes training materials associated with the Australian BioCommons workshop ‘Variant calling in humans, animals and plants with Galaxy’. This workshop took place on 25 May 2021. Variant calling in polyploid organisms, including humans, plants and animals, can help determine single or multi-variant contributors to a phenotype. Further, sexual reproduction (as compared to asexual) combines variants in a novel manner; this can be used to determine previously unknown variant - phenotype combinations but also to track lineage and lineage associated traits (GWAS studies), that all rely on highly accurate variant calling. The ability to confidently call variants in polyploid organisms is highly dependent on the balance between the frequency of variant observations against the background of non-variant observations, and even further compounded when one considers multi-variant positions within the genome. These are some of the challenges that will be explored in the workshop. In this online workshop we focused on the tools and workflows available for variant calling in polyploid organisms in Galaxy Australia. The workshop provided opportunities for hands-on experience using Freebayes for variant calling and SnpEff and GEMINI for variant annotation. The workshop made use of data from a case study on diagnosing a genetic disease however the tools and workflows are equally applicable to other polyploid organisms and biological questions. Access to all of the tools covered in this workshop was via Galaxy Australia, an online platform for biological research that allows people to use computational data analysis tools and workflows without the need for programming experience. The materials are shared under a Creative Commons 4.0 International agreement unless otherwise specified and were current at the time of the event. 
Files and materials included in this record:
Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
Schedule (PDF): schedule for the workshop
Variant calling - humans, animals, plants - slides (PPTX and PDF): slides used in the workshop
Materials shared elsewhere:
The tutorial used in this workshop is available via the Galaxy Training Network. Wolfgang Maier, Bérénice Batut, Torsten Houwaart, Anika Erxleben, Björn Grüning, 2021 Exome sequencing data analysis for diagnosing a genetic disease (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/exome-seq/tutorial.html Online; accessed 25 May 2021
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand the functional mechanisms underlying many associated variants. This is especially the case when we are interested in variants shared across multiple phenotypes. To address this challenge, we propose graph-GPA 2.0 (GGPA 2.0), a statistical framework to integrate GWAS datasets for multiple phenotypes and incorporate functional annotations within a unified framework. Our simulation studies showed that incorporating functional annotation data using GGPA 2.0 not only improves the detection of disease-associated variants, but also provides a more accurate estimation of relationships among diseases. Next, we analyzed five autoimmune diseases and five psychiatric disorders with the functional annotations derived from GenoSkyline and GenoSkyline-Plus, along with the prior disease graph generated by biomedical literature mining. For autoimmune diseases, GGPA 2.0 identified enrichment for blood-related epigenetic marks, especially B cells and regulatory T cells, across multiple diseases. Psychiatric disorders were enriched for brain-related epigenetic marks, especially the prefrontal cortex and the inferior temporal lobe for bipolar disorder and schizophrenia, respectively. In addition, the pleiotropy between bipolar disorder and schizophrenia was also detected. Finally, we found that GGPA 2.0 is robust to the use of irrelevant and/or incorrect functional annotations. These results demonstrate that GGPA 2.0 can be a powerful tool to identify genetic variants associated with each phenotype or those shared across multiple phenotypes, while also promoting an understanding of functional mechanisms underlying the associated variants.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for PDF-1 (Caenorhabditis elegans) curated by BioGRID (https://thebiogrid.org); DEFINITION: Protein PDF-1
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for Rapid polygenic adaptation in a wild population of ash trees under a novel fungal epidemic
Code used for plotting of main figures and quantifying allelic shifts attached in the GitHub repository CareyMetheringham/MardenPark. Additional files for analysis of GEBV shifts between adults and juveniles, estimation of heritability, trends in green up and simulations of allelic shifts included as:
GEBV-regression.Rmd
heritabilityEst.R
GreeningAnalysis.R
Simulated_selection_analysis.Rmd
Distinguishing the effects of selection from genetic drift.pdf
Data files:
S1 - phenotypic measurements for adult and juvenile trees
S2 - Effect sizes for SNPs used to calculate GEBV
S3 - Allelic frequencies of sites used to calculate GEBV
related_trees.csv - Predicted parentage of trees
gebv_model_df.csv - Data used for greenup calculations in GreeningAnalysis.R
maf1.pass2.miss25.snps.only.LD.vcf - Filtered file of high MAF SNPs used for parentage estimation
unlinked_sites.csv - unlinked sites used in GEBV-regression.Rmd
ebv_table_10000_250 - table of estimated breeding values in the field trial population, used for heritability estimation
MP_eefects_MIA_and_MAA.csv - Data for plotting Figure 3 - Estimated effect size of the major and minor allele, plus standard error on the estimate
Anthropogenic disturbance is known to affect population sizes and genetic population structure of many biotas. Wildfires are a major disturbance in many regions of the world, particularly in Mediterranean regions and on the Atlantic islands. Populations of many insects, such as the Madeiran Green Bush-Cricket (Psalmatophanes barretoi), are threatened by wildfires. However, the effects of wildfires on genetic structure and diversity, as well as on morphological variation of the populations, remain poorly understood. Therefore, we studied genetic diversity, structure, and potential bottlenecks of this species using microsatellites. We also studied morphological variation and fluctuating asymmetry within and between populations to unravel potential effects of wildfires. We did not find any evidence for genetic differentiation of populations, but some populations had high heterozygosity excess, regardless of burning. Morphological variation in burnt areas was lower than in non-burnt areas. Fluctu..., Geography of the sampling and studied regions We sampled individuals of P. barretoi from the known populations of this species (Rhee et al., 2023) in burnt and unburnt areas across the whole island of Madeira (Figure 2). Data on wildfires between 2006 and 2019 were obtained from the Institute of Forests and Nature Conservation in Madeira (IFCN). For genetic analysis, we divided the sampling localities into three categories according to their fire severities (“Unburnt”, “Partially burnt” and “Completely Burnt Regions”), based on a visual assessment of their overlap with the fire polygons (Figure 2). The study sites Machico and Santana, Seixal and Fanal did not experience any recent wildfires (“Unburnt Regions”), the study sites Ribeira da Vacca and Amparo had wildfires (“Burnt Regions”), and the sites Serra de Agua and Vagum, and Paul da Serra burnt partially (“Partially Burnt Regions”).
We also assigned each individual to the fire history of its locality in two categories (burnt or not) based on t..., , # Data from: Wildfires induce a reduction in body size and morphological variation of an insular endemic insect
https://doi.org/10.5061/dryad.sj3tx96f2
These are the data from Rhee et al. (2025).
Other information, including the software used, is described in detail in Rhee et al. (2025). The supplemental figures (Supplemental figures (Revised).pdf) contain box plots of the relationship between burn status and morphological traits, and the graphical results of the STRUCTURE analysis. The supplemental tables (Supplemental table (Revised).pdf) include information on the microsatellite markers and the ANOVA results for the relationship between burn status and morphological traits.
Rhee, H., Naber, S., Krehenwinkel, H. & Hochkirch, A. (2025) Wildfires induce a reduction in body size and morphological variation of an insular endemic insect. Ecological Entomology, 1–10. Available from: [https...,