100+ datasets found
  1. BigQuery Fintech Dataset

    • kaggle.com
    Updated Aug 17, 2024
    Cite
    Mustafa Keser (2024). BigQuery Fintech Dataset [Dataset]. https://www.kaggle.com/datasets/mustafakeser4/bigquery-fintech-dataset
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 17, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mustafa Keser
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset: cloud-training-demos.fintech

    This dataset, hosted on BigQuery, is designed for financial technology (fintech) training and analysis. It comprises six interconnected tables, each providing detailed insights into various aspects of customer loans, loan purposes, and regional distributions. The dataset is ideal for practicing SQL queries, building data models, and conducting financial analytics.

    Tables:

    1. customer:
      Contains records of individual customers, including demographic details and unique customer IDs. This table serves as a primary reference for analyzing customer behavior and loan distribution.

    2. loan:
      Includes detailed information about each loan issued, such as the loan amount, interest rate, and tenure. The table is crucial for analyzing lending patterns and financial outcomes.

    3. loan_count_by_year:
      Provides aggregated loan data by year, offering insights into yearly lending trends. This table helps in understanding the temporal dynamics of loan issuance.

    4. loan_purposes:
      Lists various reasons or purposes for which loans were issued, along with corresponding loan counts. This data can be used to analyze customer needs and market demands.

    5. loan_with_region:
      Combines loan data with regional information, allowing for geographical analysis of lending activities. This table is key for regional market analysis and understanding how loan distribution varies across different areas.

    6. state_region:
      Maps state names to their respective regions, enabling a more granular geographical analysis when combined with other tables in the dataset.

    Use Cases:

    • Customer Segmentation: Analyze customer data to identify distinct segments based on demographics and loan behaviors.
    • Loan Analysis: Explore loan issuance patterns, interest rates, and purposes to uncover trends and insights.
    • Regional Analysis: Combine loan and region data to understand how loan distributions vary by geography.
    • Temporal Trends: Utilize the loan_count_by_year table to observe how lending patterns evolve over time.

    This dataset is ideal for those looking to enhance their skills in SQL, financial data analysis, and BigQuery, providing a comprehensive foundation for fintech-related projects and case studies.

  2. Cyclistic. Data Analysis in SQL BigQuery

    • kaggle.com
    zip
    Updated Aug 27, 2023
    Cite
    Maryia Lvouskaya (2023). Cyclistic. Data Analysis in SQL BigQuery [Dataset]. https://www.kaggle.com/datasets/maryialvouskaya/cyclistic-data-analysis-in-sql-bigquery
    Explore at:
    Available download formats: zip (634034907 bytes)
    Dataset updated
    Aug 27, 2023
    Authors
    Maryia Lvouskaya
    Description

    These are cleaned datasets for the fictional bike-sharing company 'Cyclistic', built from original data published by Divvy Bikes. They cover the period from 01/01/2020 to 30/06/2023. I obtained the original data from the following link: https://divvy-tripdata.s3.amazonaws.com/index.html. The files that I uploaded contain the original data after cleaning in R. The data is properly licensed, well-organized, and dependable.

  3. Data Warehousing Market Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Mar 19, 2025
    Cite
    Market Report Analytics (2025). Data Warehousing Market Report [Dataset]. https://www.marketreportanalytics.com/reports/data-warehousing-market-10805
    Explore at:
    Available download formats: pdf, ppt, doc
    Dataset updated
    Mar 19, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Warehousing market is booming, projected to reach $88.4 billion by 2033 with a 13.64% CAGR. Explore key trends, leading companies like Snowflake & Databricks, and regional insights in this comprehensive market analysis. Discover how cloud-based solutions, big data analytics, and increasing data volumes are driving growth.

  4. Analytical Data Store Tools Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 17, 2025
    Cite
    Data Insights Market (2025). Analytical Data Store Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/analytical-data-store-tools-506701
    Explore at:
    Available download formats: ppt, doc, pdf
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Discover the booming Analytical Data Store Tools market! This comprehensive analysis reveals a $50 billion market in 2025, projected to reach $150 billion by 2033 at a 15% CAGR. Learn about key drivers, trends, and top players like Snowflake, Google, and Microsoft, and gain insights into regional market shares.

  5. Google Analytics Sample

    • kaggle.com
    zip
    Updated Sep 19, 2019
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Sep 19, 2019
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Google (http://google.com/)
    Authors
    Google BigQuery
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.

    Content

    The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website. It includes the following kinds of information:

    • Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc.
    • Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc.
    • Transactional data: information about the transactions that occur on the Google Merchandise Store website.

    Fork this kernel to get started.

    Acknowledgements

    Data from: https://bigquery.cloud.google.com/table/bigquery-public-data:google_analytics_sample.ga_sessions_20170801

    Banner Photo by Edho Pratama from Unsplash.

    Inspiration

    What is the total number of transactions generated per device browser in July 2017?

    The real bounce rate is defined as the percentage of visits with a single pageview. What was the real bounce rate per traffic source?

    What was the average number of product pageviews for users who made a purchase in July 2017?

    What was the average number of product pageviews for users who did not make a purchase in July 2017?

    What was the average total transactions per user that made a purchase in July 2017?

    What is the average amount of money spent per session in July 2017?

    What is the sequence of pages viewed?
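
The bounce-rate definition quoted in the questions above (the percentage of visits with a single pageview, reported per traffic source) can be sketched in a few lines of Python. This is only an illustration: the `(traffic_source, pageviews)` tuple layout, the function name, and the sample sessions are assumptions made for the example, not the actual `ga_sessions` schema.

```python
from collections import defaultdict


def bounce_rate_by_source(sessions):
    """Return {traffic_source: percentage of sessions with exactly one pageview}.

    `sessions` is an iterable of (traffic_source, pageviews) pairs.
    This layout is hypothetical and chosen only for the illustration.
    """
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for source, pageviews in sessions:
        totals[source] += 1
        if pageviews == 1:  # a "bounce" is a visit with a single pageview
            bounces[source] += 1
    return {source: 100.0 * bounces[source] / totals[source] for source in totals}


# Toy sessions (made up): four from "google", two from "(direct)".
sessions = [
    ("google", 1),
    ("google", 4),
    ("(direct)", 1),
    ("(direct)", 1),
    ("google", 1),
    ("google", 2),
]
rates = bounce_rate_by_source(sessions)
for source, rate in sorted(rates.items()):
    print(f"{source}: {rate:.1f}% bounce rate")
```

With the toy rows, half of the "google" sessions and all of the "(direct)" sessions count as bounces. Against the real dataset the same ratio would be computed in SQL over the session tables.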

  6. Looker Ecommerce BigQuery Dataset

    • kaggle.com
    Updated Jan 18, 2024
    Cite
    Mustafa Keser (2024). Looker Ecommerce BigQuery Dataset [Dataset]. https://www.kaggle.com/datasets/mustafakeser4/looker-ecommerce-bigquery-dataset
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 18, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mustafa Keser
    Description

    Looker Ecommerce Dataset Description

    CSV version of Looker Ecommerce Dataset.

    Overview

    Dataset in BigQuery: TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.

    1. distribution_centers.csv

    • Columns:
      • id: Unique identifier for each distribution center.
      • name: Name of the distribution center.
      • latitude: Latitude coordinate of the distribution center.
      • longitude: Longitude coordinate of the distribution center.

    2. events.csv

    • Columns:
      • id: Unique identifier for each event.
      • user_id: Identifier for the user associated with the event.
      • sequence_number: Sequence number of the event.
      • session_id: Identifier for the session during which the event occurred.
      • created_at: Timestamp indicating when the event took place.
      • ip_address: IP address from which the event originated.
      • city: City where the event occurred.
      • state: State where the event occurred.
      • postal_code: Postal code of the event location.
      • browser: Web browser used during the event.
      • traffic_source: Source of the traffic leading to the event.
      • uri: Uniform Resource Identifier associated with the event.
      • event_type: Type of event recorded.

    3. inventory_items.csv

    • Columns:
      • id: Unique identifier for each inventory item.
      • product_id: Identifier for the associated product.
      • created_at: Timestamp indicating when the inventory item was created.
      • sold_at: Timestamp indicating when the item was sold.
      • cost: Cost of the inventory item.
      • product_category: Category of the associated product.
      • product_name: Name of the associated product.
      • product_brand: Brand of the associated product.
      • product_retail_price: Retail price of the associated product.
      • product_department: Department to which the product belongs.
      • product_sku: Stock Keeping Unit (SKU) of the product.
      • product_distribution_center_id: Identifier for the distribution center associated with the product.

    4. order_items.csv

    • Columns:
      • id: Unique identifier for each order item.
      • order_id: Identifier for the associated order.
      • user_id: Identifier for the user who placed the order.
      • product_id: Identifier for the associated product.
      • inventory_item_id: Identifier for the associated inventory item.
      • status: Status of the order item.
      • created_at: Timestamp indicating when the order item was created.
      • shipped_at: Timestamp indicating when the order item was shipped.
      • delivered_at: Timestamp indicating when the order item was delivered.
      • returned_at: Timestamp indicating when the order item was returned.

    5. orders.csv

    • Columns:
      • order_id: Unique identifier for each order.
      • user_id: Identifier for the user who placed the order.
      • status: Status of the order.
      • gender: Gender information of the user.
      • created_at: Timestamp indicating when the order was created.
      • returned_at: Timestamp indicating when the order was returned.
      • shipped_at: Timestamp indicating when the order was shipped.
      • delivered_at: Timestamp indicating when the order was delivered.
      • num_of_item: Number of items in the order.

    6. products.csv

    • Columns:
      • id: Unique identifier for each product.
      • cost: Cost of the product.
      • category: Category to which the product belongs.
      • name: Name of the product.
      • brand: Brand of the product.
      • retail_price: Retail price of the product.
      • department: Department to which the product belongs.
      • sku: Stock Keeping Unit (SKU) of the product.
      • distribution_center_id: Identifier for the distribution center associated with the product.

    7. users.csv

    • Columns:
      • id: Unique identifier for each user.
      • first_name: First name of the user.
      • last_name: Last name of the user.
      • email: Email address of the user.
      • age: Age of the user.
      • gender: Gender of the user.
      • state: State where t...
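
As a rough illustration of how the CSV tables listed above relate to each other, the sketch below joins `order_items` to `products` on `order_items.product_id = products.id` and totals the retail price of non-returned items per category. It uses only column names that appear in the listing; the in-memory SQLite database, the sample rows, the helper name `revenue_by_category`, and the assumption that `returned_at` is empty for items that were never returned are all illustrative and not taken from the dataset.

```python
import sqlite3


def revenue_by_category(conn):
    """Count non-returned order items and sum their retail price per product category."""
    query = """
        SELECT p.category,
               COUNT(*) AS items_sold,
               ROUND(SUM(p.retail_price), 2) AS revenue
        FROM order_items AS oi
        JOIN products AS p ON p.id = oi.product_id
        WHERE oi.returned_at IS NULL  -- assumption: NULL means "not returned"
        GROUP BY p.category
        ORDER BY p.category
    """
    return conn.execute(query).fetchall()


# Toy data (made up) loaded into an in-memory database; only a subset of
# the listed columns is created, enough for the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, category TEXT, retail_price REAL, cost REAL
    );
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY, order_id INTEGER, product_id INTEGER, returned_at TEXT
    );
    INSERT INTO products VALUES (1, 'Jeans', 50.0, 20.0), (2, 'Socks', 5.0, 1.5);
    INSERT INTO order_items VALUES
        (1, 100, 1, NULL),
        (2, 100, 2, NULL),
        (3, 101, 2, '2024-01-05 10:00:00'),
        (4, 102, 1, NULL);
""")
rows = revenue_by_category(conn)
for category, items_sold, revenue in rows:
    print(category, items_sold, revenue)
```

The same join shape should carry over to the BigQuery-hosted tables, although the exact table paths and any extra columns there are not shown in this listing.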
  7. Google Analytics Sample

    • console.cloud.google.com
    Updated Jul 15, 2017
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Obfuscated%20Google%20Analytics%20360%20data&hl=en_GB (2017). Google Analytics Sample [Dataset]. https://console.cloud.google.com/marketplace/product/obfuscated-ga360-data/obfuscated-ga360-data?hl=en_GB
    Explore at:
    Dataset updated
    Jul 15, 2017
    Dataset provided by
    Google (http://google.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise, in BigQuery. It’s a great way to analyze business data and learn the benefits of using BigQuery to analyze Analytics 360 data.

    The data is typical of what an ecommerce website would see and includes the following information:

    • Traffic source data: information about where website visitors originate, including data about organic traffic, paid search traffic, and display traffic.
    • Content data: information about the behavior of users on the site, such as URLs of pages that visitors look at, how they interact with content, etc.
    • Transactional data: information about the transactions on the Google Merchandise Store website.

    Limitations: All users have view access to the dataset. This means you can query the dataset and generate reports but you cannot complete administrative tasks. Data for some fields is obfuscated, such as fullVisitorId, or removed, such as clientId, adWordsClickInfo and geoNetwork. “Not available in demo dataset” will be returned for STRING values and “null” will be returned for INTEGER values when querying the fields containing no data.

    This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.

  8. Financial Cloud Data Warehouse Solutions Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 2, 2025
    Cite
    Data Insights Market (2025). Financial Cloud Data Warehouse Solutions Report [Dataset]. https://www.datainsightsmarket.com/reports/financial-cloud-data-warehouse-solutions-1460504
    Explore at:
    Available download formats: pdf, ppt, doc
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Financial Cloud Data Warehouse Solutions market is booming, projected to reach $45 billion by 2033 with a 15% CAGR. Learn about key drivers, market trends, top companies (Snowflake, AWS, Microsoft Azure), and regional insights in this comprehensive market analysis. Discover how cloud-based solutions are transforming financial data management.

  9. Cloud Data Warehouse Solutions Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Aug 15, 2025
    Cite
    Data Insights Market (2025). Cloud Data Warehouse Solutions Report [Dataset]. https://www.datainsightsmarket.com/reports/cloud-data-warehouse-solutions-1385894
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Aug 15, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Cloud Data Warehouse (CDW) solutions market is experiencing robust growth, driven by the increasing need for scalable, cost-effective, and secure data storage and analytics solutions across various industries. The market's expansion is fueled by several factors, including the proliferation of big data, the rise of cloud computing adoption, and the growing demand for real-time business intelligence. Organizations are migrating from on-premise data warehouses to cloud-based solutions to leverage the benefits of scalability, elasticity, and pay-as-you-go pricing models. This shift is further accelerated by the increasing complexity of data management and the need for advanced analytics capabilities to gain actionable insights from vast datasets. Competition is fierce, with major players like Amazon Redshift, Snowflake, Google Cloud, and Microsoft Azure Synapse leading the market, each offering unique strengths and capabilities. However, the market also witnesses the emergence of niche players catering to specific industry needs or geographical regions. The overall market is segmented based on deployment models (public, private, hybrid), service models (SaaS, PaaS, IaaS), and industry verticals (finance, healthcare, retail, etc.). Future growth will likely be influenced by advancements in technologies such as AI, machine learning, and serverless computing, further enhancing the analytical capabilities of CDW solutions. The projected Compound Annual Growth Rate (CAGR) suggests a substantial increase in market value over the forecast period (2025-2033). Assuming a conservative CAGR of 15% (a reasonable estimate considering the rapid technological advancements in this space), and a 2025 market size of $50 billion (a reasonable estimate based on industry reports), the market is poised for significant expansion. 
    This growth will be influenced by factors such as increasing data volumes, advancements in data analytics techniques, and the growing adoption of cloud-based technologies by small and medium-sized businesses (SMBs). Despite the rapid growth, challenges remain, including data security concerns, integration complexities, and vendor lock-in. However, continuous innovation and the development of robust security measures will mitigate these challenges, paving the way for sustained market growth in the coming years.

  10. SAP DATASET | BigQuery Dataset

    • kaggle.com
    zip
    Updated Aug 20, 2024
    Cite
    Mustafa Keser (2024). SAP DATASET | BigQuery Dataset [Dataset]. https://www.kaggle.com/datasets/mustafakeser4/sap-dataset-bigquery-dataset/discussion
    Explore at:
    Available download formats: zip (365940125 bytes)
    Dataset updated
    Aug 20, 2024
    Authors
    Mustafa Keser
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Description: SAP Replicated Data

    Dataset ID: cloud-training-demos.SAP_REPLICATED_DATA

    Overview: The SAP_REPLICATED_DATA dataset in BigQuery provides a comprehensive replication of SAP (Systems, Applications, and Products in Data Processing) business data. This dataset is designed to support data analytics and machine learning tasks by offering a rich set of structured data that mimics real-world enterprise scenarios. It includes data from various SAP modules and processes, enabling users to perform in-depth analysis, build predictive models, and explore business insights.

    Content:

    • Tables and Schemas: The dataset consists of multiple tables representing different aspects of SAP business operations, including but not limited to sales, inventory, finance, and procurement data.
    • Data Types: It contains structured data with fields such as transaction IDs, timestamps, customer details, product information, sales figures, and financial metrics.
    • Data Volume: The dataset is designed to simulate large-scale enterprise data, making it suitable for performance testing, data processing, and analysis.

    Usage:

    • Business Analytics: Users can analyze business trends, sales performance, and financial metrics.
    • Machine Learning: Ideal for developing and testing machine learning models related to business forecasting, anomaly detection, and customer segmentation.
    • Data Processing: Suitable for practicing SQL queries, data transformation, and integration tasks.

    Example Use Cases:

    • Sales Analysis: Track and analyze sales performance across different regions and time periods.
    • Inventory Management: Monitor inventory levels and identify trends in stock movements.
    • Financial Reporting: Generate financial reports and analyze expense patterns.

    For more information and to access the dataset, visit the BigQuery public datasets page or refer to the dataset documentation in the BigQuery console.

    Tables:

    | File Name | Description |
    | --- | --- |
    | adr6.csv | Addresses with organizational units. Contains address details related to organizational units like departments or branches. |
    | adrc.csv | General Address Data. Provides information about addresses, including details such as street, city, and postal codes. |
    | adrct.csv | Address Contact Information. Contains contact information linked to addresses, including phone numbers and email addresses. |
    | adrt.csv | Address Details. Includes detailed address data such as street addresses, city, and country codes. |
    | ankt.csv | Accounting Document Segment. Provides details on segments within accounting documents, including account numbers and amounts. |
    | anla.csv | Asset Master Data. Contains information about fixed assets, including asset identification and classification. |
    | bkpf.csv | Accounting Document Header. Contains headers of accounting documents, such as document numbers and fiscal year. |
    | bseg.csv | Accounting Document Segment. Details line items within accounting documents, including account details and amounts. |
    | but000.csv | Business Partners. Contains basic information about business partners, including IDs and names. |
    | but020.csv | Business Partner Addresses. Provides address details associated with business partners. |
    | cepc.csv | Customer Master Data - Central. Contains centralized data for customer master records. |
    | cepct.csv | Customer Master Data - Contact. Provides contact details associated with customer records. |
    | csks.csv | Cost Center Master Data. Contains data about cost centers within the organization. |
    | cskt.csv | Cost Center Texts. Provides text descriptions and labels for cost centers. |
    | dd03l.csv | Data Element Field Labels. Contains labels and descriptions for data fields in the SAP system. |
    | ekbe.csv | Purchase Order History. Details history of purchase orders, including quantities and values. |
    | ekes.csv | Purchasing Document History. Contains history of purchasing documents including changes and statuses. |
    | eket.csv | Purchase Order Item History. Details changes and statuses for individual purchase order items. |
    | ekkn.csv | Purchase Order Account Assignment. Provides account assignment details for purchas... |
  11. 1000 Cannabis Genomes Project

    • kaggle.com
    zip
    Updated Feb 26, 2019
    Cite
    Google BigQuery (2019). 1000 Cannabis Genomes Project [Dataset]. https://www.kaggle.com/bigquery/genomics-cannabis
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Feb 26, 2019
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Google (http://google.com/)
    Authors
    Google BigQuery
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Cannabis is a genus of flowering plants in the family Cannabaceae.

    Source: https://en.wikipedia.org/wiki/Cannabis

    Content

    In October 2016, Phylos Bioscience released a genomic open dataset of approximately 850 strains of Cannabis via the Open Cannabis Project. In combination with other genomics datasets made available by Courtagen Life Sciences, Michigan State University, NCBI, Sunrise Medicinal, University of Calgary, University of Toronto, and Yunnan Academy of Agricultural Sciences, the total amount of publicly available data exceeds 1,000 samples taken from nearly as many unique strains.

    https://medium.com/google-cloud/dna-sequencing-of-1000-cannabis-strains-publicly-available-in-google-bigquery-a33430d63998

    These data were retrieved from the National Center for Biotechnology Information’s Sequence Read Archive (NCBI SRA), processed using the BWA aligner and FreeBayes variant caller, indexed with the Google Genomics API, and exported to BigQuery for analysis. Data are available directly from Google Cloud Storage at gs://gcs-public-data--genomics/cannabis, as well as via the Google Genomics API as dataset ID 918853309083001239, and an additional duplicated subset of only transcriptome data as dataset ID 94241232795910911, as well as in the BigQuery dataset bigquery-public-data:genomics_cannabis.

    All tables in the Cannabis Genomes Project dataset have a suffix like _201703. The suffix is referred to as [BUILD_DATE] in the descriptions below. The dataset is updated frequently as new releases become available.

    The following tables are included in the Cannabis Genomes Project dataset:

    Sample_info contains fields extracted for each SRA sample, including the SRA sample ID and other data that give indications about the type of sample. Sample types include: strain, library prep methods, and sequencing technology. See SRP008673 for an example of upstream sample data. SRP008673 is the University of Toronto sequencing of Cannabis Sativa subspecies Purple Kush.

    MNPR01_reference_[BUILD_DATE] contains reference sequence names and lengths for the draft assembly of Cannabis Sativa subspecies Cannatonic produced by Phylos Bioscience. This table contains contig identifiers and their lengths.

    MNPR01_[BUILD_DATE] contains variant calls for all included samples and types (genomic, transcriptomic) aligned to the MNPR01_reference_[BUILD_DATE] table. Samples can be found in the sample_info table. The MNPR01_[BUILD_DATE] table is exported using the Google Genomics BigQuery variants schema. This table is useful for general analysis of the Cannabis genome.

    MNPR01_transcriptome_[BUILD_DATE] is similar to the MNPR01_[BUILD_DATE] table, but it includes only the subset transcriptomic samples. This table is useful for transcribed gene-level analysis of the Cannabis genome.

    Fork this kernel to get started with this dataset.

    Acknowledgements

    Dataset Source: http://opencannabisproject.org/
    Category: Genomics
    Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://www.ncbi.nlm.nih.gov/home/about/policies.shtml - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
    Update frequency: As additional data are released to GenBank
    View in BigQuery: https://bigquery.cloud.google.com/dataset/bigquery-public-data:genomics_cannabis
    View in Google Cloud Storage: gs://gcs-public-data--genomics/cannabis

    Banner Photo by Rick Proctor from Unsplash.

    Inspiration

    Which Cannabis samples are included in the variants table?

    Which contigs in the MNPR01_reference_[BUILD_DATE] table have the highest density of variants?

    How many variants does each sample have at the THC Synthase gene (THCA1) locus?

  12. S3_GDELT BigQuery

    • figshare.com
    html
    Updated May 8, 2022
    Cite
    INNOCENSIA OWUOR (2022). S3_GDELT BigQuery [Dataset]. http://doi.org/10.6084/m9.figshare.19729708.v2
    Explore at:
    Available download formats: html
    Dataset updated
    May 8, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    INNOCENSIA OWUOR
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    GDELT query used to retrieve data from Google BigQuery

  13. Ethereum Blockchain

    • console.cloud.google.com
    Updated Nov 26, 2023
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:BigQuery%20Public%20Data&hl=de (2023). Ethereum Blockchain [Dataset]. https://console.cloud.google.com/marketplace/product/bigquery-public-data/blockchain-analytics-ethereum-mainnet-us?hl=de
    Explore at:
    Dataset updated
    Nov 26, 2023
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Google (http://google.com/)
    Description

    This dataset surfaces data from the Ethereum blockchain and includes tables for blocks, transactions, logs, and more. Ethereum is a decentralized open-source blockchain system that features its own cryptocurrency, Ether. A blockchain is an ever-growing tree of blocks. Each block contains a number of transactions. For more information, see the Blockchain Analytics documentation.

  14. Big Data Processing And Distribution Systems Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Insights Market (2025). Big Data Processing And Distribution Systems Report [Dataset]. https://www.datainsightsmarket.com/reports/big-data-processing-and-distribution-systems-528339
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Big Data Processing and Distribution Systems market is experiencing robust growth, driven by the exponential increase in data volume across various industries. The market, estimated at $50 billion in 2025, is projected to witness a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $150 billion by 2033. This expansion is fueled by several key factors. The rising adoption of cloud-based solutions, which offer scalability and cost-effectiveness, is a significant driver. Furthermore, the increasing demand for real-time analytics and advanced data processing capabilities across sectors like finance, healthcare, and e-commerce is propelling market growth. The emergence of new technologies such as edge computing and AI-powered analytics is further accelerating the adoption of sophisticated big data processing solutions.

    However, market growth is not without its challenges. Data security and privacy concerns, coupled with the complexity of implementing and managing big data systems, remain significant restraints. The need for specialized skills and expertise in data science and engineering also adds to the overall cost and complexity of adoption. Despite these challenges, the market's continued expansion is anticipated, driven by the persistent need for efficient and insightful data management in an increasingly data-driven world.

    Segmentation within the market is diverse, encompassing cloud-based platforms, on-premise systems, and specialized tools for data integration, processing, and visualization. Leading players such as Google, AWS, Microsoft, Snowflake, and Databricks are competing fiercely to capture market share, further stimulating innovation and driving market expansion.
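
    The projection quoted in this description follows from ordinary compound growth. The sketch below is a minimal illustration, assuming the stated figures ($50 billion base in 2025, 15% CAGR, eight annual periods to 2033); the function names `project_market_size` and `implied_cagr` are hypothetical and do not come from the report.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound a base value forward by `years` annual periods at rate `cagr`."""
    return base_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Invert the compounding formula to recover the annual growth rate."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Figures quoted in the description: $50 billion in 2025, 15% CAGR, 2025 to 2033.
    periods = 2033 - 2025
    projected = project_market_size(50.0, 0.15, periods)
    print(f"Projected 2033 size: ${projected:.1f} billion")

    # Going the other way: the rate implied by growing from $50B to $150B.
    rate = implied_cagr(50.0, 150.0, periods)
    print(f"CAGR implied by $50B -> $150B: {rate:.1%}")
```

    Under these assumptions the compounded value comes out at roughly $153 billion, which is consistent with the "approximately $150 billion" stated above.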

  15. Cloud Data Warehouse Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 4, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Insights Market (2025). Cloud Data Warehouse Report [Dataset]. https://www.datainsightsmarket.com/reports/cloud-data-warehouse-1958553
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The cloud data warehouse market is experiencing robust growth, driven by the increasing need for scalable, cost-effective, and readily accessible data analytics solutions. The market's expansion is fueled by several key factors, including the burgeoning adoption of cloud computing across various industries, the proliferation of big data, and the growing demand for real-time business intelligence. Organizations are migrating from on-premise data warehouses to cloud-based solutions to leverage enhanced scalability, reduced infrastructure costs, and improved agility. This shift is further accelerated by the availability of advanced analytics tools and services within the cloud ecosystem, enabling businesses to derive actionable insights from their data more efficiently. Competitive pressures and the need to gain a competitive edge are also significant drivers, pushing enterprises to adopt sophisticated data warehousing solutions capable of handling complex analytical workloads. The market is highly fragmented, with major players such as Amazon, Google, Microsoft, and others competing intensely through innovation, strategic partnerships, and aggressive pricing strategies. While the market shows significant promise, certain challenges persist. Data security and privacy concerns remain a major obstacle to wider adoption, particularly in regulated industries. Integration complexities with existing on-premise systems and the need for skilled professionals to manage and maintain cloud data warehouses also present hurdles. However, ongoing technological advancements in areas such as data encryption, access control, and automated data integration are mitigating these challenges. Furthermore, the emergence of new technologies, such as serverless architectures and AI-powered analytics, is continuously reshaping the market landscape, fostering innovation and expanding the market's potential. 
    Over the forecast period (2025-2033), consistent growth is anticipated, fueled by ongoing digital transformation initiatives across various sectors. We estimate a conservative CAGR (considering industry averages for similar tech sectors) of 15% over this period, indicating substantial growth opportunities.

  16. Analytics Query Accelerator Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Aug 15, 2025
    + more versions
    Cite
    Data Insights Market (2025). Analytics Query Accelerator Report [Dataset]. https://www.datainsightsmarket.com/reports/analytics-query-accelerator-531112
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Aug 15, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Analytics Query Accelerator (AQA) market is experiencing robust growth, driven by the increasing demand for real-time insights from massive datasets across various industries. The market, estimated at $15 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 20% from 2025 to 2033, reaching an estimated $70 billion by 2033. This expansion is fueled by several key factors. Firstly, the proliferation of big data and the need for rapid data analysis across sectors like finance, healthcare, and e-commerce are creating significant demand. Secondly, advancements in cloud computing and distributed database technologies are enabling faster query processing and improved performance of AQAs. Finally, the rising adoption of advanced analytics techniques such as machine learning and artificial intelligence is further driving the need for efficient query acceleration solutions. Key players like Google, Amazon, Snowflake, Microsoft, Databricks, Teradata, and Cloudera are actively competing in this rapidly evolving landscape, investing heavily in R&D and strategic partnerships to maintain market leadership. The growth trajectory of the AQA market is further shaped by emerging trends such as the increasing adoption of serverless computing and the expansion of edge analytics. However, challenges remain, including the complexity of implementing and managing AQA solutions, the need for skilled professionals, and concerns related to data security and privacy. Despite these restraints, the long-term outlook for the AQA market remains exceptionally positive, fueled by continuous technological innovations and the ever-increasing reliance on data-driven decision-making across all industries. The market segmentation is likely diversified across various deployment models (cloud, on-premise), data types (structured, unstructured), and industry verticals. 
    This diverse landscape presents numerous opportunities for both established players and emerging companies to capture market share.

  17. noaa-global-forecast-system

    • console.cloud.google.com
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:BigQuery%20Public%20Data, noaa-global-forecast-system [Dataset]. https://console.cloud.google.com/marketplace/product/bigquery-public-data/noaa-global-forecast-system
    Explore at:
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Google (http://google.com/)
    Description

    The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). The GFS dataset consists of selected model outputs (described below) as gridded forecast variables. The 384-hour forecasts, with 3-hour forecast interval, are made at 6-hour temporal resolution (i.e. updated four times daily). Use the 'creation_time' and 'forecast_time' properties to select data of interest. The GFS is a coupled model, composed of an atmosphere model, an ocean model, a land/soil model, and a sea ice model which work together to provide an accurate picture of weather conditions. See history of recent modifications to the global forecast/analysis system, the model performance statistical web page, and the documentation homepage for more information.

  18. MultiversX Blockchain

    • console.cloud.google.com
    Updated Jan 10, 2024
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:BigQuery%20Public%20Data (2024). MultiversX Blockchain [Dataset]. https://console.cloud.google.com/marketplace/product/bigquery-public-data/blockchain-analytics-multiversx-mainnet-eu
    Explore at:
    Dataset updated
    Jan 10, 2024
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Google (http://google.com/)
    Description

    MultiversX is a highly scalable, secure, and decentralized blockchain network created to enable radically new applications for users, businesses, society, and the new metaverse frontier. This dataset is one of many crypto datasets that are available within Google Cloud Public Datasets. As with other Google Cloud public datasets, you can query this dataset at no charge for up to 1TB of processing per month. Watch this short video to learn how to get started with the public datasets.

  19. SEC Public Dataset

    • console.cloud.google.com
    Updated Jul 19, 2023
    + more versions
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:U.S.%20Securities%20and%20Exchange%20Commission&hl=en_GB (2023). SEC Public Dataset [Dataset]. https://console.cloud.google.com/marketplace/product/sec-public-data-bq/sec-public-dataset?hl=en_GB
    Explore at:
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    Google (http://google.com/)
    Description

    In the U.S., public companies, certain insiders, and broker-dealers are required to file regularly with the SEC. The SEC makes this data available online for anybody to view and use via its Electronic Data Gathering, Analysis, and Retrieval (EDGAR) database. The SEC updates this data every quarter, going back to January 2009. To aid analysis, a quick summary view of the data has been created that is not available in the original dataset. The quick summary view pulls together signals into a single table that would otherwise have to be joined from multiple tables, enabling a more streamlined user experience. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.

  20. Cloud Analytics Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Jan 16, 2025
    Cite
    Pro Market Reports (2025). Cloud Analytics Market Report [Dataset]. https://www.promarketreports.com/reports/cloud-analytics-market-8915
    Explore at:
    Available download formats: ppt, doc, pdf
    Dataset updated
    Jan 16, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The size of the Cloud Analytics Market was valued at USD 23.82 billion in 2023 and is projected to reach USD 82.22 billion by 2032, with an expected CAGR of 19.36% during the forecast period. The cloud analytics market has witnessed significant growth, driven by the rising demand for data-driven decision-making and the increasing adoption of cloud computing technologies. Organizations are leveraging cloud analytics to process and analyze vast amounts of structured and unstructured data, enabling them to gain actionable insights and improve operational efficiency. The market's expansion is fueled by the scalability, cost-effectiveness, and real-time capabilities of cloud-based solutions compared to traditional on-premises systems. Industries such as retail, healthcare, banking, and IT are increasingly integrating cloud analytics into their operations to enhance customer experiences, optimize supply chains, and mitigate risks. Furthermore, advancements in artificial intelligence and machine learning are augmenting the analytical capabilities of cloud platforms, allowing businesses to forecast trends and automate complex processes. The growing popularity of hybrid and multi-cloud environments is also contributing to the market's growth by offering flexibility and addressing data security concerns. As organizations continue to prioritize digital transformation and data utilization, the cloud analytics market is poised for sustained expansion, driven by technological innovations and the increasing importance of real-time data insights.

    Recent developments include: July 2020: Google LLC, a technology company, launched BigQuery Omni, a multi-cloud analytics solution that enables enterprises to access and securely analyze data across Amazon Web Services, Google Cloud, and Microsoft Azure. September 2020: TIBCO Software Inc., a leading enterprise data solutions provider, introduced TIBCO Hyperconverged Analytics. The Hyperconverged Analytics solution and the services the company offers combine data science, visual analytics, and streaming analytics to give companies expanded analytical strategies.

    Key drivers for this market are: increasing data volumes and the need for insights; growing adoption of cloud computing platforms; advances in AI and machine learning; demand for real-time analytics; enhanced data security and compliance requirements.

    Potential restraints include: data privacy and security concerns; data integration complexities; lack of skilled professionals; high implementation and maintenance costs; data center outages and downtime.

    Notable trends are: hybrid cloud analytics models; predictive maintenance and prescriptive analytics; edge analytics and IoT integration; advanced data visualization techniques.

Cite
Mustafa Keser (2024). BigQuery Fintech Dataset [Dataset]. https://www.kaggle.com/datasets/mustafakeser4/bigquery-fintech-dataset

BigQuery Fintech Dataset

Comprehensive fintech data for loan and customer analysis.

Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 17, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mustafa Keser
License

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically

Description

Dataset: cloud-training-demos.fintech

This dataset, hosted on BigQuery, is designed for financial technology (fintech) training and analysis. It comprises six interconnected tables, each providing detailed insights into various aspects of customer loans, loan purposes, and regional distributions. The dataset is ideal for practicing SQL queries, building data models, and conducting financial analytics.

Tables:

  1. customer:
    Contains records of individual customers, including demographic details and unique customer IDs. This table serves as a primary reference for analyzing customer behavior and loan distribution.

  2. loan:
    Includes detailed information about each loan issued, such as the loan amount, interest rate, and tenure. The table is crucial for analyzing lending patterns and financial outcomes.

  3. loan_count_by_year:
    Provides aggregated loan data by year, offering insights into yearly lending trends. This table helps in understanding the temporal dynamics of loan issuance.

  4. loan_purposes:
    Lists various reasons or purposes for which loans were issued, along with corresponding loan counts. This data can be used to analyze customer needs and market demands.

  5. loan_with_region:
    Combines loan data with regional information, allowing for geographical analysis of lending activities. This table is key for regional market analysis and understanding how loan distribution varies across different areas.

  6. state_region:
    Maps state names to their respective regions, enabling a more granular geographical analysis when combined with other tables in the dataset.

Use Cases:

  • Customer Segmentation: Analyze customer data to identify distinct segments based on demographics and loan behaviors.
  • Loan Analysis: Explore loan issuance patterns, interest rates, and purposes to uncover trends and insights.
  • Regional Analysis: Combine loan and region data to understand how loan distributions vary by geography.
  • Temporal Trends: Utilize the loan_count_by_year table to observe how lending patterns evolve over time.

This dataset is ideal for those looking to enhance their skills in SQL, financial data analysis, and BigQuery, providing a comprehensive foundation for fintech-related projects and case studies.
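
The regional analysis use case amounts to joining loans to regions on state and then aggregating. The sketch below shows that pattern in plain Python on made-up rows. The description does not list column names, so `loan_id`, `state`, `region`, and `loan_amount` are assumptions for illustration only, as is the helper name `loan_totals_by_region`.

```python
from collections import defaultdict

# Hypothetical rows: the description does not list column names, so
# `state`, `region`, and `loan_amount` are assumptions for illustration.
state_region = [
    {"state": "CA", "region": "West"},
    {"state": "NY", "region": "Northeast"},
    {"state": "WA", "region": "West"},
]
loans = [
    {"loan_id": 1, "state": "CA", "loan_amount": 12000.0},
    {"loan_id": 2, "state": "NY", "loan_amount": 8000.0},
    {"loan_id": 3, "state": "WA", "loan_amount": 5000.0},
    {"loan_id": 4, "state": "TX", "loan_amount": 7000.0},  # no region mapping
]


def loan_totals_by_region(loans, state_region):
    """Join loans to regions on state and sum loan amounts per region."""
    region_of = {row["state"]: row["region"] for row in state_region}
    totals = defaultdict(float)
    for loan in loans:
        # Loans whose state has no mapping are grouped under "Unknown",
        # mirroring a LEFT JOIN rather than silently dropping them.
        region = region_of.get(loan["state"], "Unknown")
        totals[region] += loan["loan_amount"]
    return dict(totals)


if __name__ == "__main__":
    for region, total in sorted(loan_totals_by_region(loans, state_region).items()):
        print(region, total)
```

Keeping unmapped states in an explicit "Unknown" bucket is a design choice: it makes gaps in the state-to-region mapping visible instead of quietly shrinking the totals.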
