1 dataset found

gnomAD
console.cloud.google.com
Updated Jun 23, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:Broad%20Institute%20of%20MIT%20and%20Harvard&inv=1&invt=Ab29og (2020). gnomAD [Dataset]. https://console.cloud.google.com/marketplace/product/broad-institute/gnomad
Explore at:
Dataset updated
Jun 23, 2020
Dataset provided by
Googlehttp://google.com/
Description
The Genome Aggregation Database (gnomAD) is maintained by an international coalition of investigators to aggregate and harmonize data from large-scale sequencing projects. These public datasets are available in VCF format in Google Cloud Storage and in Google BigQuery as integer range partitioned tables . Each dataset is sharded by chromosome meaning variants are distributed across 24 tables (indicated with “_chr*” suffix). Utilizing the sharded tables reduces query costs significantly. Variant Transforms was used to process these VCF files and import them to BigQuery. VEP annotations were parsed into separate columns for easier analysis using Variant Transforms’ annotation support . These public datasets are included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. Use this quick start guide to quickly learn how to access public datasets on Google Cloud Storage. Find out more in our blog post, Providing open access to gnomAD on Google Cloud . Questions? Contact gcp-life-sciences-discuss@googlegroups.com.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

https://console.cloud.google.com/marketplace/browse?filter=partner:Broad%20Institute%20of%20MIT%20and%20Harvard&inv=1&invt=Ab29og (2020). gnomAD [Dataset]. https://console.cloud.google.com/marketplace/product/broad-institute/gnomad

gnomAD

Explore at:

Dataset updated

Jun 23, 2020

Dataset provided by

Googlehttp://google.com/

Description

The Genome Aggregation Database (gnomAD) is maintained by an international coalition of investigators to aggregate and harmonize data from large-scale sequencing projects. These public datasets are available in VCF format in Google Cloud Storage and in Google BigQuery as integer range partitioned tables . Each dataset is sharded by chromosome meaning variants are distributed across 24 tables (indicated with “_chr*” suffix). Utilizing the sharded tables reduces query costs significantly. Variant Transforms was used to process these VCF files and import them to BigQuery. VEP annotations were parsed into separate columns for easier analysis using Variant Transforms’ annotation support . These public datasets are included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. Use this quick start guide to quickly learn how to access public datasets on Google Cloud Storage. Find out more in our blog post, Providing open access to gnomAD on Google Cloud . Questions? Contact gcp-life-sciences-discuss@googlegroups.com.

Clear search

Close search

Google apps

Main menu