93 datasets found
  1. Genome Aggregation Database

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Jul 19, 2018
    + more versions
    Cite
    (2018). Genome Aggregation Database [Dataset]. http://identifiers.org/RRID:SCR_014964
    Explore at:
    Dataset updated
    Jul 19, 2018
    Description

    Database that aggregates exome and genome sequencing data from large-scale sequencing projects. The gnomAD data set contains individuals sequenced using multiple exome capture methods and sequencing chemistries. Raw data from the projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.

  2. Genome Aggregation Database

    • bioregistry.io
    Updated Dec 19, 2022
    Cite
    (2022). Genome Aggregation Database [Dataset]. https://bioregistry.io/gnomad
    Explore at:
    Dataset updated
    Dec 19, 2022
    License

    https://bioregistry.io/spdx:CC0-1.0

    Description

    The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators, with the goal of aggregating and harmonizing both exome and genome sequencing data from a wide variety of large-scale sequencing projects, and making summary data available for the wider scientific community (from https://gnomad.broadinstitute.org).

  3. Genome Aggregation Database (gnomAD) - Data Lakehouse Ready

    • registry.opendata.aws
    Updated Sep 13, 2021
    Cite
    Amazon Web Services (2021). Genome Aggregation Database (gnomAD) - Data Lakehouse Ready [Dataset]. https://registry.opendata.aws/gnomad-data-lakehouse-ready/
    Explore at:
    Dataset updated
    Sep 13, 2021
    Dataset provided by
    Amazon Web Services (http://aws.amazon.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. This dataset was derived from summary data from gnomAD release 3.1, available on the Registry of Open Data on AWS for ready enrollment into the Data Lake as Code.

  4. Genome of A

    • figshare.com
    txt
    Updated Apr 19, 2025
    Cite
    lihao Guo (2025). Genome of A [Dataset]. http://doi.org/10.6084/m9.figshare.28828337.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Apr 19, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    lihao Guo
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Genome assembly

  5. gnomAD

    • console.cloud.google.com
    Updated Jul 25, 2023
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Broad%20Institute%20of%20MIT%20and%20Harvard&hl=zh_TW (2023). gnomAD [Dataset]. https://console.cloud.google.com/marketplace/product/broad-institute/gnomad?hl=zh_TW
    Explore at:
    Dataset updated
    Jul 25, 2023
    Dataset provided by
    Google (http://google.com/)
    Description

    The Genome Aggregation Database (gnomAD) is maintained by an international coalition of investigators to aggregate and harmonize data from large-scale sequencing projects. These public datasets are available in VCF format in Google Cloud Storage and in Google BigQuery as integer range partitioned tables. Each dataset is sharded by chromosome, meaning variants are distributed across 24 tables (indicated with the “_chr*” suffix). Using the sharded tables reduces query costs significantly. Variant Transforms was used to process these VCF files and import them into BigQuery. VEP annotations were parsed into separate columns for easier analysis using Variant Transforms’ annotation support. These public datasets are included in BigQuery's free tier of 1 TB of processing per month: each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. Use this quick start guide to learn how to access public datasets on Google Cloud Storage. Find out more in our blog post, Providing open access to gnomAD on Google Cloud. Questions? Contact gcp-life-sciences-discuss@googlegroups.com.
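
    As a rough illustration of the per-chromosome sharding described above, the sketch below builds the 24 shard names and a query that touches a single shard. The table prefix (`example_dataset.genomes`) and the column names are assumptions made for illustration only; the actual names should be taken from the marketplace listing.

```python
# Sketch only: the table prefix and column names are assumptions,
# not taken from the listing above.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]  # 24 shards


def shard_table(prefix: str, chromosome: str) -> str:
    """Return the name of the table shard holding one chromosome."""
    if chromosome not in CHROMOSOMES:
        raise ValueError(f"unknown chromosome: {chromosome}")
    return f"{prefix}_chr{chromosome}"


def region_query(prefix: str, chromosome: str, start: int, end: int) -> str:
    """Build a query that scans one shard instead of all 24."""
    table = shard_table(prefix, chromosome)
    return (
        "SELECT reference_name, start_position, reference_bases\n"
        f"FROM `{table}`\n"
        f"WHERE start_position BETWEEN {start} AND {end}"
    )


if __name__ == "__main__":
    # Hypothetical prefix; prints the SQL text without running it.
    print(region_query("example_dataset.genomes", "17", 43000000, 43200000))
```

    Restricting a query to one shard is what keeps the scanned bytes, and therefore the cost, low.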

  6. European ancestry mutational burden analysis variants.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au (2023). European ancestry mutational burden analysis variants. [Dataset]. http://doi.org/10.1371/journal.pone.0239083.s006
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Variant data for those variants used in the European ancestry mutational burden analysis which fall within nominally significant disrupted genes. (XLS)

  7. Hispanic mutational burden analysis variants.

    • figshare.com
    xls
    Updated Jun 14, 2023
    Cite
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au (2023). Hispanic mutational burden analysis variants. [Dataset]. http://doi.org/10.1371/journal.pone.0239083.s005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Variant data for those variants used in the Hispanic mutational burden analysis which fall within nominally significant disrupted genes. (XLS)

  8. Gene Ontology (GO) enrichment analysis.

    • figshare.com
    xls
    Updated May 31, 2023
    + more versions
    Cite
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au (2023). Gene Ontology (GO) enrichment analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0239083.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Gene Ontology (GO) enrichment analysis.

  9. ExAc

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Oct 21, 2014
    Cite
    (2014). ExAc [Dataset]. http://identifiers.org/RRID:SCR_004068
    Explore at:
    Dataset updated
    Oct 21, 2014
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 9, 2023. An aggregated data platform for genome sequencing data created by a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data set provided on this website spans 61,486 unrelated individuals sequenced as part of various disease-specific and population genetic studies. Individuals affected by severe pediatric disease have been removed, so this data set should serve as a useful reference set of allele frequencies for severe disease studies. All of the raw data from these projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects. The consortium asks that you not publish global (genome-wide) analyses of these data until after the ExAC flagship paper has been published, estimated to be in early 2015. If you're uncertain which category your analyses fall into, please email them. The aggregation and release of summary data from the exomes collected by the Exome Aggregation Consortium has been approved by the Partners IRB (protocol 2013P001477, Genomic approaches to gene discovery in rare neuromuscular diseases).

  10. RDV gene combinations.

    • plos.figshare.com
    xlsx
    Updated Jun 2, 2023
    Cite
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au (2023). RDV gene combinations. [Dataset]. http://doi.org/10.1371/journal.pone.0239083.s007
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Luke Hebert; Paul Hillman; Craig Baker; Michael Brown; Allison Ashley-Koch; James E. Hixson; Alanna C. Morrison; Hope Northrup; Kit Sing Au
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    A list of nominally significant disrupted gene combinations that contained RDVs in the same sample. (XLSX)

  11. Data of research.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 29, 2024
    Cite
    de Melo Neto, João Simão; dos Santos, Gabriela Sepêda; Navegante, Aline Carvalho Gonçalves; Luz, Marlucia Oliveira; Júnior, José Ribamar Leal Dias; Pierre, Marie Esther; dos Reis Galhardo, Deizyane; de Campos Gomes, Fabiana; Dias, Helana Augusta Andrade Leal (2024). Data of research. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001281411
    Explore at:
    Dataset updated
    Aug 29, 2024
    Authors
    de Melo Neto, João Simão; dos Santos, Gabriela Sepêda; Navegante, Aline Carvalho Gonçalves; Luz, Marlucia Oliveira; Júnior, José Ribamar Leal Dias; Pierre, Marie Esther; dos Reis Galhardo, Deizyane; de Campos Gomes, Fabiana; Dias, Helana Augusta Andrade Leal
    Description

    Introduction: Genetic variants may influence Toll-like receptor (TLR) signaling in the immune response to human papillomavirus (HPV) infection and lead to cervical cancer. In this study, we investigated the pattern of TLR expression in the transcriptome of HPV-positive and HPV-negative cervical cancer samples and looked for variants potentially related to TLR gene alterations in exomes from different populations.

    Materials and methods: Cervical tissue samples from 28 women, obtained from the Gene Expression Omnibus database, were used to examine TLR gene expression. Subsequently, the transcripts related to the TLRs that showed significant gene expression were queried in the Genome Aggregation Database to search for variants in more than 5,728 exomes from different ethnicities.

    Results: Cancer and HPV were found to be associated (p<0.0001). TLR1 (p = 0.001), TLR3 (p = 0.004), TLR4 (221060_s_at) (p = 0.001), TLR7 (p = 0.001; p = 0.047), TLR8 (p = 0.002) and TLR10 (p = 0.008) were negatively regulated, while TLR4 (1552798_at) (p<0.0001) and TLR6 (p = 0.019) were positively regulated in HPV-positive patients (p<0.05). The clinical significance of the variants was statistically significant for TLR1, TLR3, TLR6 and TLR8 in association with ethnicity. Genetic variants in different TLRs have been found in various ethnic populations. Variants of the TLR genes were of the following types: TLR1 (5_prime_UTR), TLR4 (start_lost), TLR8 (synonymous; missense) and TLR10 (3_prime_UTR). The “missense” variant was found to have a risk of its clinical significance being pathogenic in South Asian populations (OR = 56,820 [95%CI:40,206,80,299]).

    Conclusion: The results of this study suggest that the variants found in the transcriptomes of different populations may lead to impairment of the functional aspect of TLRs that show significant gene expression in cervical cancer samples caused by HPV.

  12. List of 20,038 human-specific residues from 7052 different proteins, and the...

    • plos.figshare.com
    xls
    Updated Jul 31, 2025
    Cite
    Pablo Mier; Miguel A. Andrade-Navarro; Enrique Morett (2025). List of 20,038 human-specific residues from 7052 different proteins, and the amino acid in the rest of the non-human primates. [Dataset]. http://doi.org/10.1371/journal.pone.0328504.s003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Pablo Mier; Miguel A. Andrade-Navarro; Enrique Morett
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    List of 20,038 human-specific residues from 7052 different proteins, and the amino acid in the rest of the non-human primates.

  13. Data from: Genome-scale target capture of mitochondrial and nuclear...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Dec 11, 2020
    Cite
    Mads Reinholdt Jensen; Philip Francis Thomsen (2020). Genome-scale target capture of mitochondrial and nuclear environmental DNA from water samples [Dataset]. http://doi.org/10.5061/dryad.4mw6m9086
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2020
    Dataset provided by
    Aarhus University
    Authors
    Mads Reinholdt Jensen; Philip Francis Thomsen
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Environmental DNA (eDNA) provides a promising supplement to traditional sampling methods for population genetic inferences, but current studies have almost entirely focused on short mitochondrial markers. Here, we develop one mitochondrial and one nuclear set of target capture probes for the whale shark (Rhincodon typus) and test them on seawater samples collected in Qatar to investigate the potential of target capture for eDNA-based population studies. The mitochondrial target capture successfully retrieved ~235x (90x-352x per base position) coverage of the whale shark mitogenome. Using a minor allele frequency of 5%, we find 29 variable sites throughout the mitogenome, indicative of at least five contributing individuals. We also retrieved numerous mitochondrial reads from an abundant non-target species, mackerel tuna (Euthynnus affinis), showing a clear relation between sequence similarity to the capture probes and the number of captured reads. The nuclear target capture probes retrieved only a few reads and polymorphic variants from the whale shark, but we successfully obtained millions of reads and thousands of polymorphic variants with different allele frequencies from E. affinis. We demonstrate that target capture of complete mitochondrial genomes and thousands of nuclear loci is possible from aquatic eDNA samples. Our results highlight that careful probe design, taking into account the range of divergence between target and non-target sequences as well as the presence of non-target species at the sampling site, is crucial. Environmental DNA sampling coupled with target capture approaches provides an efficient means with which to retrieve population genomic data from aggregating and spawning aquatic species.

    Methods: This data is the raw sequencing output for a mitochondrial capture and a nuclear capture using custom-made myBaits target capture. Both captures are based on a single sample (two one-liter eDNA samples combined after extraction) collected in Qatari waters in the middle of a whale shark aggregation, in an area with an expected high abundance of Euthynnus affinis. There is thus no need for demultiplexing.

    The sequencing was performed on a MiSeq with 301 bp PE sequencing.
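
    The 5% minor allele frequency cutoff mentioned in the description can be illustrated with a small sketch. This is not the authors' pipeline: the pileup data are made up, and the "minor allele" is simply taken to be the second most common base observed at a position.

```python
from collections import Counter

MAF_THRESHOLD = 0.05  # the 5% cutoff quoted in the description


def minor_allele_frequency(bases: str) -> float:
    """Frequency of the second most common base among reads at one site."""
    counts = Counter(bases).most_common()
    if len(counts) < 2:
        return 0.0  # only one base observed: the site is invariant
    return counts[1][1] / len(bases)


def variable_sites(pileup: dict[int, str],
                   threshold: float = MAF_THRESHOLD) -> list[int]:
    """Positions whose minor allele frequency reaches the threshold."""
    return sorted(pos for pos, bases in pileup.items()
                  if minor_allele_frequency(bases) >= threshold)


if __name__ == "__main__":
    # Hypothetical read bases observed at three mitogenome positions.
    pileup = {
        101: "A" * 100,            # invariant
        202: "C" * 90 + "T" * 10,  # MAF 0.10, counted as variable
        303: "G" * 98 + "A" * 2,   # MAF 0.02, below the cutoff
    }
    print(variable_sites(pileup))  # prints [202]
```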

  14. Projected AAMD sample sizes required to test for significant association...

    • plos.figshare.com
    xls
    Updated Jun 13, 2023
    Cite
    Amy V. Jones; Darin Curtiss; Claire Harris; Tom Southerington; Marco Hautalahti; Pauli Wihuri; Johanna Mäkelä; Roosa E. Kallionpää; Enni Makkonen; Theresa Knopp; Arto Mannermaa; Erna Mäkinen; Anne-Mari Moilanen; Tongalp H. Tezel; Nadia K. Waheed (2023). Projected AAMD sample sizes required to test for significant association between Type 1 CFI rare variants and disease in different ethnicities defined by gnomAD. [Dataset]. http://doi.org/10.1371/journal.pone.0272260.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Amy V. Jones; Darin Curtiss; Claire Harris; Tom Southerington; Marco Hautalahti; Pauli Wihuri; Johanna Mäkelä; Roosa E. Kallionpää; Enni Makkonen; Theresa Knopp; Arto Mannermaa; Erna Mäkinen; Anne-Mari Moilanen; Tongalp H. Tezel; Nadia K. Waheed
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Projected AAMD sample sizes required to test for significant association between Type 1 CFI rare variants and disease in different ethnicities defined by gnomAD.

  15. BioSample Database at EBI

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Jan 29, 2022
    + more versions
    Cite
    (2022). BioSample Database at EBI [Dataset]. http://identifiers.org/RRID:SCR_004856
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Database that aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive or the PRoteomics IDEntifications (PRIDE) database. It provides links to assays for specific samples, and accepts direct submissions of sample information. The goals of the BioSample Database include:

    • recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE;
    • minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases; and
    • supporting cross database queries by sample characteristics.

    The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples (at) ebi.ac.uk.

    Current data sources:

    • European Nucleotide Archive (424,811 samples)
    • PRIDE (17,001 samples)
    • ArrayExpress (1,187,884 samples)
    • ENCODE cell lines (119 samples)
    • CORIELL cell lines (27,002 samples)
    • Thousand Genome (2,628 samples)
    • HapMap (1,417 samples)
    • IMSR (248,660 samples)

  16. ExAC Gene

    • bioregistry.io
    Updated Dec 18, 2021
    + more versions
    Cite
    (2021). ExAC Gene [Dataset]. https://bioregistry.io/registry/exac.gene
    Explore at:
    Dataset updated
    Dec 18, 2021
    Description

    The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data pertains to unrelated individuals sequenced as part of various disease-specific and population genetic studies and serves as a reference set of allele frequencies for severe disease studies. This collection references gene information.

  17. Bioinformatics Links Directory

    • neuinfo.org
    • scicrunch.org
    • +3more
    Updated Jan 29, 2022
    + more versions
    Cite
    (2022). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The different types of information available in this portal:

    • Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
    • Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
    • DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
    • Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
    • Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
    • Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
    • Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
    • Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
    • Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
    • Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
    • RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.

  18. Table_1_Clinical Variability of SYNJ1-Associated Early-Onset...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Mar 25, 2021
    Cite
    Lesage, Suzanne; Tesson, Christelle; Arezki, Mohamed; Corvol, Jean-Christophe; Benmahdjoub, Mustapha; Kesraoui, Selma; Brice, Alexis; Bertrand, Hélène; Mangone, Graziella; Singleton, Andrew (2021). Table_1_Clinical Variability of SYNJ1-Associated Early-Onset Parkinsonism.DOCX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000862443
    Explore at:
    Dataset updated
    Mar 25, 2021
    Authors
    Lesage, Suzanne; Tesson, Christelle; Arezki, Mohamed; Corvol, Jean-Christophe; Benmahdjoub, Mustapha; Kesraoui, Selma; Brice, Alexis; Bertrand, Hélène; Mangone, Graziella; Singleton, Andrew
    Description

    Autosomal recessive early-onset parkinsonism is clinically and genetically heterogeneous. Mutations of three genes, PRKN, PINK1, and DJ-1, cause pure phenotypes usually characterized by levodopa-responsive Parkinson's disease. By contrast, mutations of other genes, including ATP13A2, PLA2G6, FBXO7, DNAJC6, SYNJ1, VPS13C, and PTRHD1, cause rarer, more severe diseases with a poor response to levodopa, generally with additional atypical features. We performed data mining on a gene panel or whole-exome sequencing in 460 index cases with early-onset (≤ 40 years) Parkinson's disease, including 57 with autosomal recessive disease and 403 isolated cases. We identified two isolated cases carrying biallelic mutations of SYNJ1 (double-heterozygous p.D791fs/p.Y232H and homozygous p.Y832C mutations) and two siblings with the recurrent homozygous p.R258Q mutation. All four variants were absent or rare in the Genome Aggregation Database, were predicted to be deleterious on in silico analysis and were found to be highly conserved between species. The patient with both the previously unknown p.D791fs and p.Y232H mutations presented with dystonia-parkinsonism accompanied by a frontal syndrome and oculomotor disturbances at the age of 39. In addition, two siblings from an Algerian consanguineous family carried the homozygous p.R258Q mutation and presented generalized tonic-clonic seizures during childhood, with severe intellectual disability, followed by progressive parkinsonism during their teens. By contrast, the isolated patient with the homozygous p.Y832C mutation, diagnosed at the age of 20, had typical parkinsonism, with no atypical symptoms and slow disease progression. Our findings expand the mutational spectrum and phenotypic profile of SYNJ1-related parkinsonism.

  19. Aggregation of recount3 RNA-seq data improves inference of consensus and...

    • zenodo.org
    bin, zip
    Updated Aug 30, 2024
    Cite
    Prashanthi Ravichandran (2024). Aggregation of recount3 RNA-seq data improves inference of consensus and tissue-specific gene co-expression networks [Dataset]. http://doi.org/10.5281/zenodo.10480999
    Explore at:
    Available download formats: bin, zip
    Dataset updated
    Aug 30, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Prashanthi Ravichandran
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Data and Inferred Networks accompanying the manuscript entitled - “Aggregation of recount3 RNA-seq data improves the inference of consensus and context-specific gene co-expression networks”

    Authors: Prashanthi Ravichandran, Princy Parsana, Rebecca Keener, Kaspar Hansen, Alexis Battle

    Affiliations: Johns Hopkins University School of Medicine, Johns Hopkins University Department of Computer Science, Johns Hopkins University Bloomberg School of Public Health

    Description:

    This folder includes data produced in the analysis contained in the manuscript and inferred consensus and context-specific networks from graphical lasso and WGCNA with varying numbers of edges. Contents include:

    • all_metadata.rds: File including meta-data columns of study accession ID, sample ID, assigned tissue category, cancer status and disease status obtained through manual curation for the 95,484 RNA-seq samples used in the study.

    • all_counts.rds: log2-transformed, RPKM-normalized read counts for 5999 genes and 95,484 RNA-seq samples, which were used for dimensionality reduction and data exploration.

    • precision_matrices.zip: Zipped folder including networks inferred by graphical lasso for different experiments presented in the paper using weighted covariance aggregation following PC correction.

      • The networks can be found as follows. First, select the folder corresponding to the network of interest (for example, Blood). This folder contains two or more subfolders, one per level of data aggregation; select the one for the appropriate level (for blood-specific networks, either all samples or GTEx). That subfolder contains precision matrices inferred across a range of penalization parameters. To view the precision matrix inferred for a particular value X of the penalization parameter, select the file labeled lambda_X.rds.

      • For select networks, we have included the computed centrality measures which can be accessed at centrality_X.rds for a particular value of the penalization parameter X.

      • We have also included .rds files that list the hub genes from the consensus networks inferred from non-cancerous samples at “normal_hubs.rds”, and the consensus networks inferred from cancerous samples at “cancer_hubs.rds”

      • The file “context_specific_selected_networks.csv” includes the networks that were selected for downstream biological interpretation based on the scale-free criterion which is also summarized in the Supplementary Tables.

    • WGCNA.zip: A zipped folder containing gene modules inferred from WGCNA for sequentially aggregated GTEx, SRA, and blood studies. Select the data aggregated and the number of studies based on folder names. For example, blood networks inferred from 20 studies can be accessed at blood/consensus/net_20. The individual networks correspond to distinct cut heights, and include information on the cut height used, the genes that the network was inferred over, merged module labels, and merged module colors.
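
    The folder conventions described above can be summarized as path-building helpers. This is only a sketch: the root folder names, the exact spelling of the aggregation subfolders, and the assumption that centrality files sit next to the precision matrices are guesses, not details confirmed by the description.

```python
from pathlib import PurePosixPath


def precision_matrix_path(root: str, network: str, aggregation: str,
                          lam: str) -> PurePosixPath:
    """Path to the graphical lasso precision matrix for penalty `lam`."""
    return PurePosixPath(root) / network / aggregation / f"lambda_{lam}.rds"


def centrality_path(root: str, network: str, aggregation: str,
                    lam: str) -> PurePosixPath:
    """Path to centrality measures; same folder is an assumption."""
    return PurePosixPath(root) / network / aggregation / f"centrality_{lam}.rds"


def wgcna_network_path(root: str, data: str, n_studies: int,
                       kind: str = "consensus") -> PurePosixPath:
    """Path to WGCNA modules, following the blood/consensus/net_20 example."""
    return PurePosixPath(root) / data / kind / f"net_{n_studies}"


if __name__ == "__main__":
    # Hypothetical root folders (the unzipped archives) and penalty value.
    print(precision_matrix_path("precision_matrices", "Blood", "GTEx", "0.1"))
    print(wgcna_network_path("WGCNA", "blood", 20))
```

    The penalization value is passed as a string so that the file name matches whatever spelling the archive uses.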

  20. n

    Sociability GWAS in a population-based sample : summary statistics of a...

    • narcis.nl
    • lifesciences.datastations.nl
    pdf
    Updated Mar 12, 2021
    Cite
    Bralten, J.B. (Radboud University); Roth Mota, N. (Radboud University); Klemann, C.J.H.M. (Radboud University); Witte, W. de (2021). Sociability GWAS in a population-based sample : summary statistics of a genome-wide association study of an aggregated sociability score in the UK Biobank [Dataset]. http://doi.org/10.17026/dans-ztj-zga6
    Explore at:
    Available download formats: pdf
    Dataset updated
    Mar 12, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Bralten, J.B. (Radboud University); Roth Mota, N. (Radboud University); Klemann, C.J.H.M. (Radboud University); Witte, W. de
    Area covered
    United Kingdom (northlimit=59.62358300012501; eastlimit=2.374666058072581; southlimit=49.568413008749225; westlimit=-8.205608345022652)
    Description

    Levels of sociability are continuously distributed in the general population, and decreased sociability represents an early manifestation of several brain disorders. Here, we investigated the genetic underpinnings of sociability in the population.

    Main questions of our research: 1. Are there common genetic variants that are associated with sociability in the general population? 2. Are genetic variants that are associated with sociability also associated with neuropsychiatric disorders?

    Type of data uploaded in this repository: The UK Biobank project (see https://www.ukbiobank.ac.uk/) is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. The database is globally accessible to approved researchers undertaking vital research into the most common and life-threatening diseases. The raw data that this project is based on come from the publicly available UK Biobank set, which is very large and is therefore not provided here. Here we provide only the results of our analysis, which is also described at https://www.biorxiv.org/content/10.1101/781195v2 and is currently in revision at a scientific journal. In the dataset you will find the association of 9327396 genetic variants with the phenotype sociability. This dataset is not suitable for opening in Excel and is best opened on a cluster computer or with specialized software.

    Subjects: The UK Biobank (UKBB) is a major population-based cohort from the United Kingdom that includes individuals aged between 37 and 73 years. We constructed a sociability measure based on the aggregation of scores per participant on four questions from the UKBB database that link to sociability: (1) a question about the frequency of friend/family visits, (2) a question on the number and type of social venues that are visited, (3) a question about worrying after social embarrassment and (4) a question about feeling lonely, leading to a sociability score ranging from 0 to 4. Participants were excluded if they had somatic problems that could be related to social withdrawal (BMI < 15 or BMI > 40, narcolepsy (all the time), stroke, severe tinnitus, deafness or brain-related cancers) or if they answered “No friends/family outside household”, “Do not know” or “Prefer not to answer” to any of the questions.
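
    As a rough sketch of the aggregation described above: the description does not say how each question was scored, so the code below assumes that each of the four questions contributes 0 or 1 to the total, which is consistent with the stated 0 to 4 range but is not confirmed by the authors. The function names are hypothetical, and only the BMI exclusion is shown.

```python
from typing import Sequence


def is_excluded_by_bmi(bmi: float) -> bool:
    """BMI exclusion from the description: BMI < 15 or BMI > 40."""
    return bmi < 15 or bmi > 40


def sociability_score(items: Sequence[int]) -> int:
    """Sum four item scores into a 0-4 total.

    Assumption: each item is coded 0 or 1. The source only states that
    four questions were aggregated into a score ranging from 0 to 4.
    """
    if len(items) != 4 or any(v not in (0, 1) for v in items):
        raise ValueError("expected four item scores, each 0 or 1")
    return sum(items)


# Hypothetical participant with three of the four items scored 1.
score = sociability_score([1, 0, 1, 1])
print(score, is_excluded_by_bmi(27.3))
```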

    SNP genotyping and quality control: Details about the available genome-wide genotyping data for UKBB participants have been reported previously (PMID: 30305743). We used third-release genotyping data (see https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100319). Briefly, 49,950 participants were genotyped using the UK BiLEVE Axiom Array and 438,427 participants were genotyped using the UK Biobank Axiom Array. Genotypes were imputed into the dataset using the Haplotype Reference Consortium (HRC) and the UK10K haplotype resource. To account for ethnicity, we included only those individuals who identified themselves as "white" by self-report and plotted the Principal Components (PCs) provided by the UKBB, excluding individuals considered to be outliers according to PCs 1 and 2. Genetic relatedness, calculated with KING kinship and provided by the UKBB (https://kenhanscombe.github.io/ukbtools/articles/explore-ukb-data.html ; http://www.ukbiobank.ac.uk/wp-content/uploads/2014/04/UKBiobank_genotyping_QC_documentation-web.pdf), was used to identify first- and second-degree relatives. Subsequently, ‘families’ (i.e. clusters of related individuals above an IBD>0.125 threshold) were created and only one individual from each of these ‘families’ was included in the analysis. If self-reported sex and SNP-based sex differed, individuals were excluded from further analysis. Single nucleotide polymorphisms (SNPs) with minor allele frequency <0.005, Hardy-Weinberg equilibrium test P value <1e−6, missing genotype rate >0.05, or imputation quality of INFO <0.8 were excluded. In the current study, all analyses are based on 342,461 participants of European ancestry for whom both genotype data and sociability scores were available.
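
    The SNP-level exclusion thresholds listed above can be written as a single predicate. This is a minimal sketch rather than the authors' pipeline: the VariantStats record and its field names are hypothetical, and only the four numeric thresholds are taken from the description.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Per-SNP summary values (hypothetical record layout)."""
    maf: float           # minor allele frequency
    hwe_p: float         # Hardy-Weinberg equilibrium test P value
    missing_rate: float  # missing genotype rate
    info: float          # imputation quality (INFO)


def passes_qc(v: VariantStats) -> bool:
    """Keep a SNP only if it triggers none of the stated exclusion rules."""
    return (
        v.maf >= 0.005             # excluded if MAF < 0.005
        and v.hwe_p >= 1e-6        # excluded if HWE P < 1e-6
        and v.missing_rate <= 0.05 # excluded if missingness > 0.05
        and v.info >= 0.8          # excluded if INFO < 0.8
    )


# Hypothetical variants: one common and well imputed, one too rare.
print(passes_qc(VariantStats(maf=0.12, hwe_p=0.4, missing_rate=0.01, info=0.97)))
print(passes_qc(VariantStats(maf=0.001, hwe_p=0.4, missing_rate=0.01, info=0.97)))
```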

    Genome-wide association analysis: Genome-wide association analysis with the imputed marker dosages was performed in PLINK1.9, using a linear regression model with the sociability measure as the dependent variable and including sex, age, the first 10 PCs, assessment center, and genotype batch as covariates. SNPs were considered significantly associated if they had a p-value < 5e-8. Associated loci were considered independent of each other at an r2 threshold of 0.6, and lead SNPs were classified as the SNP with the smallest association p-value at an r2 threshold of 0.1, using a 250 kb window. The summary statistics come from the plink2 linear regression analysis.
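
    Because the summary statistics cover millions of variants, one way to apply the p-value < 5e-8 threshold without loading the whole file is to stream it row by row. The sketch below assumes a tab-delimited file with a header row and columns named SNP and P; the actual column names and delimiter in the deposited file may differ, and the rsIDs in the example are made up.

```python
import csv
import io
from typing import Dict, Iterable, Iterator

GENOME_WIDE_P = 5e-8  # significance threshold stated in the description


def significant_rows(lines: Iterable[str], p_column: str = "P") -> Iterator[Dict[str, str]]:
    """Yield rows whose p-value is below the genome-wide threshold.

    `lines` can be an open file handle, so the file is never held in
    memory at once. Rows with a non-numeric p-value (e.g. "NA") are skipped.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        try:
            p = float(row[p_column])
        except (TypeError, ValueError):
            continue
        if p < GENOME_WIDE_P:
            yield row


# Tiny in-memory stand-in for the real summary statistics file.
sample = io.StringIO("SNP\tP\nrs1\t1e-9\nrs2\t0.03\nrs3\tNA\n")
hits = [row["SNP"] for row in significant_rows(sample)]
print(hits)
```

    With a real file, the same function could be called on a handle returned by open(), which keeps memory use roughly constant regardless of file size.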

(2018). Genome Aggregation Database [Dataset]. http://identifiers.org/RRID:SCR_014964

Genome Aggregation Database

RRID:SCR_014964, biotools:gnomad, Genome Aggregation Database (RRID:SCR_014964), gnomAD, gnomAD 2.0, gnomAD Browser, gnomAD version 2.0, Exome Aggregation Consortium

Explore at:
76 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jul 19, 2018
Description

Database that aggregates exome and genome sequencing data from large-scale sequencing projects. The gnomAD data set contains individuals sequenced using multiple exome capture methods and sequencing chemistries. Raw data from the projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.
