https://creativecommons.org/publicdomain/zero/1.0/
The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.
The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. It includes the following kinds of information:
Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc.
Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc.
Transactional data: information about the transactions that occur on the Google Merchandise Store website.
Fork this kernel to get started.
Banner Photo by Edho Pratama from Unsplash.
What is the total number of transactions generated per device browser in July 2017?
The real bounce rate is defined as the percentage of visits with a single pageview. What was the real bounce rate per traffic source?
What was the average number of product pageviews for users who made a purchase in July 2017?
What was the average number of product pageviews for users who did not make a purchase in July 2017?
What was the average total transactions per user that made a purchase in July 2017?
What is the average amount of money spent per session in July 2017?
What is the sequence of pages viewed?
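The real bounce rate defined in the questions above can be computed once each visit's traffic source and pageview count are known. A minimal Python sketch, assuming hypothetical session records with `source` and `pageviews` keys (illustrative names, not the field paths of the BigQuery schema):

```python
from collections import defaultdict

def bounce_rate_by_source(sessions):
    """Return {source: percentage of visits with exactly one pageview}."""
    visits = defaultdict(int)
    bounces = defaultdict(int)
    for session in sessions:
        source = session["source"]
        visits[source] += 1
        if session["pageviews"] == 1:
            bounces[source] += 1
    return {source: 100.0 * bounces[source] / visits[source] for source in visits}

# Hypothetical sample data, not taken from the dataset.
sample_sessions = [
    {"source": "google", "pageviews": 1},
    {"source": "google", "pageviews": 4},
    {"source": "direct", "pageviews": 1},
    {"source": "direct", "pageviews": 1},
]
rates = bounce_rate_by_source(sample_sessions)
```

The same grouping idea carries over to a SQL query against the real tables, where the grouping key would be the traffic source field.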
This dataset is a custom reference of Google Analytics field definitions.
It was specifically compiled to enhance datasets like the Google Analytics 360 data from the Google Merchandise Store, which lacks field descriptions in its original BigQuery schema. By providing detailed definitions for each field, this reference aims to improve the interpretability of the data, especially when used by language models or analytics tools that rely on contextual understanding to process and answer queries effectively.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise, in BigQuery. It's a great way to analyze business data and learn the benefits of using BigQuery to analyze Analytics 360 data.
The data is typical of what an ecommerce website would see and includes the following information:
Traffic source data: information about where website visitors originate, including data about organic traffic, paid search traffic, and display traffic.
Content data: information about the behavior of users on the site, such as URLs of pages that visitors look at, how they interact with content, etc.
Transactional data: information about the transactions on the Google Merchandise Store website.
Limitations: All users have view access to the dataset. This means you can query the dataset and generate reports but you cannot complete administrative tasks. Data for some fields is obfuscated, such as fullVisitorId, or removed, such as clientId, adWordsClickInfo and geoNetwork. "Not available in demo dataset" will be returned for STRING values and "null" will be returned for INTEGER values when querying the fields containing no data.
This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
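When working with query results, it can help to normalize the placeholder values described under Limitations. A minimal sketch, assuming rows arrive as plain Python dicts (the row shape and the sample field names are hypothetical):

```python
PLACEHOLDER = "Not available in demo dataset"

def clean_row(row):
    """Replace the demo-dataset placeholder string with None.

    Integer fields with no data are assumed to already arrive as None
    (the Python counterpart of a SQL null), so they pass through unchanged.
    """
    return {key: (None if value == PLACEHOLDER else value) for key, value in row.items()}

# Hypothetical row for illustration only.
row = {"browser": "Chrome", "city": "Not available in demo dataset", "pageviews": None}
cleaned = clean_row(row)
```

Treating both kinds of missing value as `None` lets downstream code test for absence in one way.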
CSV version of Looker Ecommerce Dataset.
Overview
Dataset in BigQuery
TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation.
This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
distribution_centers.csv
id: Unique identifier for each distribution center.
name: Name of the distribution center.
latitude: Latitude coordinate of the distribution center.
longitude: Longitude coordinate of the distribution center.

events.csv
id: Unique identifier for each event.
user_id: Identifier for the user associated with the event.
sequence_number: Sequence number of the event.
session_id: Identifier for the session during which the event occurred.
created_at: Timestamp indicating when the event took place.
ip_address: IP address from which the event originated.
city: City where the event occurred.
state: State where the event occurred.
postal_code: Postal code of the event location.
browser: Web browser used during the event.
traffic_source: Source of the traffic leading to the event.
uri: Uniform Resource Identifier associated with the event.
event_type: Type of event recorded.

inventory_items.csv
id: Unique identifier for each inventory item.
product_id: Identifier for the associated product.
created_at: Timestamp indicating when the inventory item was created.
sold_at: Timestamp indicating when the item was sold.
cost: Cost of the inventory item.
product_category: Category of the associated product.
product_name: Name of the associated product.
product_brand: Brand of the associated product.
product_retail_price: Retail price of the associated product.
product_department: Department to which the product belongs.
product_sku: Stock Keeping Unit (SKU) of the product.
product_distribution_center_id: Identifier for the distribution center associated with the product.

order_items.csv
id: Unique identifier for each order item.
order_id: Identifier for the associated order.
user_id: Identifier for the user who placed the order.
product_id: Identifier for the associated product.
inventory_item_id: Identifier for the associated inventory item.
status: Status of the order item.
created_at: Timestamp indicating when the order item was created.
shipped_at: Timestamp indicating when the order item was shipped.
delivered_at: Timestamp indicating when the order item was delivered.
returned_at: Timestamp indicating when the order item was returned.

orders.csv
order_id: Unique identifier for each order.
user_id: Identifier for the user who placed the order.
status: Status of the order.
gender: Gender information of the user.
created_at: Timestamp indicating when the order was created.
returned_at: Timestamp indicating when the order was returned.
shipped_at: Timestamp indicating when the order was shipped.
delivered_at: Timestamp indicating when the order was delivered.
num_of_item: Number of items in the order.

products.csv
id: Unique identifier for each product.
cost: Cost of the product.
category: Category to which the product belongs.
name: Name of the product.
brand: Brand of the product.
retail_price: Retail price of the product.
department: Department to which the product belongs.
sku: Stock Keeping Unit (SKU) of the product.
distribution_center_id: Identifier for the distribution center associated with the product.

users.csv
id: Unique identifier for each user.
first_name: First name of the user.
last_name: Last name of the user.
email: Email address of the user.
age: Age of the user.
gender: Gender of the user.
state: State where t...
MultiversX is a highly scalable, secure and decentralized blockchain network created to enable radically new applications, for users, businesses, society, and the new metaverse frontier. This dataset is one of many crypto datasets that are available within Google Cloud Public Datasets. As with other Google Cloud public datasets, you can query this dataset for free, up to 1TB/month of free processing, every month. Watch this short video to learn how to get started with the public datasets.
This dataset provides a curated subset of the anonymized Google Analytics event data for three months of the Google Merchandise Store. The full dataset is available as a BigQuery Public Dataset.
The data includes information on items sold in the store and how much money was spent by users over time. It is both comprehensive enough to invite real analysis yet simple enough to facilitate teaching.
Photo by Arthur Osipyan on Unsplash
https://www.datainsightsmarket.com/privacy-policy
Discover the booming Analytical Data Store Tools market! This comprehensive analysis reveals a $50 billion market in 2025, projected to reach $150 billion by 2033 at a 15% CAGR. Learn about key drivers, trends, and top players like Snowflake, Google, and Microsoft, and gain insights into regional market shares.
https://www.datainsightsmarket.com/privacy-policy
The Cloud Data Warehouse (CDW) solutions market is experiencing robust growth, driven by the increasing need for scalable, cost-effective, and secure data storage and analytics solutions across various industries. The market's expansion is fueled by several factors, including the proliferation of big data, the rise of cloud computing adoption, and the growing demand for real-time business intelligence. Organizations are migrating from on-premise data warehouses to cloud-based solutions to leverage the benefits of scalability, elasticity, and pay-as-you-go pricing models. This shift is further accelerated by the increasing complexity of data management and the need for advanced analytics capabilities to gain actionable insights from vast datasets. Competition is fierce, with major players like Amazon Redshift, Snowflake, Google Cloud, and Microsoft Azure Synapse leading the market, each offering unique strengths and capabilities. However, the market also witnesses the emergence of niche players catering to specific industry needs or geographical regions. The overall market is segmented based on deployment models (public, private, hybrid), service models (SaaS, PaaS, IaaS), and industry verticals (finance, healthcare, retail, etc.). Future growth will likely be influenced by advancements in technologies such as AI, machine learning, and serverless computing, further enhancing the analytical capabilities of CDW solutions. The projected Compound Annual Growth Rate (CAGR) suggests a substantial increase in market value over the forecast period (2025-2033). Assuming a conservative CAGR of 15% (a reasonable estimate considering the rapid technological advancements in this space), and a 2025 market size of $50 billion (a reasonable estimate based on industry reports), the market is poised for significant expansion. 
This growth will be influenced by factors such as increasing data volumes, advancements in data analytics techniques, and the growing adoption of cloud-based technologies by small and medium-sized businesses (SMBs). Despite the rapid growth, challenges remain, including data security concerns, integration complexities, and vendor lock-in. However, continuous innovation and the development of robust security measures will mitigate these challenges, paving the way for sustained market growth in the coming years.
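The compounding behind these figures is easy to check. A short sketch, using the 15% CAGR and the $50 billion 2025 base that the text assumes (the function name is illustrative):

```python
def project_value(base, cagr, years):
    """Compound a base value at a constant annual growth rate."""
    return base * (1.0 + cagr) ** years

# 2025 -> 2033 is eight compounding periods.
value_2033 = project_value(50.0, 0.15, 2033 - 2025)
print(f"Projected 2033 market size: ${value_2033:.1f} billion")
```

Under these assumptions the projection comes out at roughly $153 billion for 2033.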
This dataset surfaces data from the Ethereum blockchain and includes tables for blocks, transactions, logs, and more. Ethereum is a decentralized open-source blockchain system that features its own cryptocurrency, Ether. A blockchain is an ever-growing tree of blocks. Each block contains a number of transactions. For more information, see the Blockchain Analytics documentation.
https://www.marketreportanalytics.com/privacy-policy
The Data Warehousing market is booming, projected to reach $88.4 billion by 2033 with a 13.64% CAGR. Explore key trends, leading companies like Snowflake & Databricks, and regional insights in this comprehensive market analysis. Discover how cloud-based solutions, big data analytics, and increasing data volumes are driving growth.
https://www.datainsightsmarket.com/privacy-policy
The Analytics Query Accelerator (AQA) market is experiencing robust growth, driven by the increasing demand for real-time insights from massive datasets across various industries. The market, estimated at $15 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 20% from 2025 to 2033, reaching an estimated $70 billion by 2033. This expansion is fueled by several key factors. Firstly, the proliferation of big data and the need for rapid data analysis across sectors like finance, healthcare, and e-commerce are creating significant demand. Secondly, advancements in cloud computing and distributed database technologies are enabling faster query processing and improved performance of AQAs. Finally, the rising adoption of advanced analytics techniques such as machine learning and artificial intelligence is further driving the need for efficient query acceleration solutions. Key players like Google, Amazon, Snowflake, Microsoft, Databricks, Teradata, and Cloudera are actively competing in this rapidly evolving landscape, investing heavily in R&D and strategic partnerships to maintain market leadership. The growth trajectory of the AQA market is further shaped by emerging trends such as the increasing adoption of serverless computing and the expansion of edge analytics. However, challenges remain, including the complexity of implementing and managing AQA solutions, the need for skilled professionals, and concerns related to data security and privacy. Despite these restraints, the long-term outlook for the AQA market remains exceptionally positive, fueled by continuous technological innovations and the ever-increasing reliance on data-driven decision-making across all industries. The market segmentation is likely diversified across various deployment models (cloud, on-premise), data types (structured, unstructured), and industry verticals. 
This diverse landscape presents numerous opportunities for both established players and emerging companies to capture market share.
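The quoted figures can be cross-checked by solving the compound-growth relation for the rate. A minimal sketch, assuming eight compounding periods between 2025 and 2033 (the function name is illustrative):

```python
def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` periods."""
    return (end / start) ** (1.0 / years) - 1.0

periods = 2033 - 2025
rate = implied_cagr(15.0, 70.0, periods)
compounded = 15.0 * 1.20 ** periods
print(f"Implied CAGR: {rate:.1%}; $15B at 20% for {periods} years: ${compounded:.1f}B")
```

Under that assumption the stated endpoints imply a rate a little above 20%, while 20% compounding gives roughly $64 billion, so the $70 billion figure may be rounded or based on a different period count.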
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains Ethereum transaction data from September 1, 2022 to September 11, 2022.
This data is downloaded directly from Google BigQuery - Ethereum, a public dataset that provides comprehensive blockchain data.
Free and Open Source: This dataset is released under the CC0-1.0 (Creative Commons Zero) license, meaning it is in the public domain and free to use for any purpose without restrictions.
The dataset includes detailed transaction information from the Ethereum blockchain during the specified time period, including transaction hashes, addresses, values, gas fees, and timestamps.
ethtx_220901_220911_*: Ethereum transaction data files split into 41 shards (~9.6GB total)
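Because the shards share the `ethtx_220901_220911_` prefix, a glob pattern is a convenient way to collect them. A minimal sketch, assuming the shards have been downloaded into a local directory (the directory location and the file extension are not stated here, so the pattern matches on the prefix only):

```python
from pathlib import Path

def list_shards(data_dir):
    """Return the shard files found in `data_dir`, sorted by name."""
    return sorted(Path(data_dir).glob("ethtx_220901_220911_*"))
```

Sorting by name keeps the shards in a stable order when they are read one after another.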
Sui is a Layer 1 blockchain which aims to overcome blockchain limitations of slow speeds, high costs, and complex onboarding to make Web3 accessible and efficient for a wide range of users. Sui is built by Mysten Labs, a blockchain infrastructure company founded by four ex-Meta engineers who worked on the Diem blockchain project. Sui leverages the Move programming language for smart contract development, offering resource safety and formal verification for secure development. Data freshness can range from minutes to hours depending on chain activity and transaction volumes. Questions? Please reach out to cloud-blockchain-analytics-help@google.com
This dataset surfaces data from the Fantom blockchain and includes tables for blocks, transactions, logs, and more. Fantom is a decentralized, blockchain-based operating system with smart contract functionality, proof-of-stake principles as its consensus algorithm and a cryptocurrency native to the system, known as FTM. A blockchain is an ever-growing tree of blocks. Each block contains a number of transactions. For more information, see the Blockchain Analytics documentation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
This dataset surfaces data from the Polygon blockchain and includes tables for blocks, transactions, logs, and more. Polygon Technology is Ethereum's Internet of Blockchain. Polygon uses Proof of Stake technology and it is a zero-knowledge technology. Polygon is a Layer-2 chain that settles to Ethereum's Layer 1 (L1) chain. Polygon's goal is to offer faster and cheaper transactions on Ethereum by using sidechains, which are stand-alone blockchains that run alongside the Ethereum mainnet. For more information, see the Blockchain Analytics documentation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
https://www.datainsightsmarket.com/privacy-policy
The cloud data warehouse market is experiencing robust growth, driven by the increasing need for scalable, cost-effective, and readily accessible data analytics solutions. The market's expansion is fueled by several key factors, including the burgeoning adoption of cloud computing across various industries, the proliferation of big data, and the growing demand for real-time business intelligence. Organizations are migrating from on-premise data warehouses to cloud-based solutions to leverage enhanced scalability, reduced infrastructure costs, and improved agility. This shift is further accelerated by the availability of advanced analytics tools and services within the cloud ecosystem, enabling businesses to derive actionable insights from their data more efficiently. Competitive pressures and the need to gain a competitive edge are also significant drivers, pushing enterprises to adopt sophisticated data warehousing solutions capable of handling complex analytical workloads. The market is highly fragmented, with major players such as Amazon, Google, Microsoft, and others competing intensely through innovation, strategic partnerships, and aggressive pricing strategies. While the market shows significant promise, certain challenges persist. Data security and privacy concerns remain a major obstacle to wider adoption, particularly in regulated industries. Integration complexities with existing on-premise systems and the need for skilled professionals to manage and maintain cloud data warehouses also present hurdles. However, ongoing technological advancements in areas such as data encryption, access control, and automated data integration are mitigating these challenges. Furthermore, the emergence of new technologies, such as serverless architectures and AI-powered analytics, is continuously reshaping the market landscape, fostering innovation and expanding the market's potential. 
Over the forecast period (2025-2033), consistent growth is anticipated, fueled by ongoing digital transformation initiatives across various sectors. We estimate a conservative CAGR (considering industry averages for similar tech sectors) of 15% over this period, indicating substantial growth opportunities.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This curated dataset consists of 269,353 patent documents (published patent applications and granted patents) spanning the 1976 to 2016 period and is intended to help identify promising R&D on the horizon in diagnostics, therapeutics, data analytics, and model biological systems.
USPTO Cancer Moonshot Patent Data was generated using USPTO examiner tools to execute a series of queries designed to identify cancer-specific patents and patent applications. This includes drugs, diagnostics, cell lines, mouse models, radiation-based devices, surgical devices, image analytics, data analytics, and genomic-based inventions.
"USPTO Cancer Moonshot Patent Data" by the USPTO, for public use. Frumkin, Jesse and Myers, Amanda F., Cancer Moonshot Patent Data (August, 2016).
Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_oce_cancer
This dataset surfaces data from the Avalanche blockchain and includes tables for blocks, transactions, logs, and more. Avalanche is a decentralized, open-source proof of stake blockchain with smart contract functionality. AVAX is the native cryptocurrency of the platform. A blockchain is an ever-growing tree of blocks. Each block contains a number of transactions. For more information, see the Blockchain Analytics documentation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
https://creativecommons.org/publicdomain/zero/1.0/
Bitcoin and other cryptocurrencies have captured the imagination of technologists, financiers, and economists. Digital currencies are only one application of the underlying blockchain technology. Like its predecessor, Bitcoin, the Ethereum blockchain can be described as an immutable distributed ledger. However, creator Vitalik Buterin also extended the set of capabilities by including a virtual machine that can execute arbitrary code stored on the blockchain as smart contracts.
Both Bitcoin and Ethereum are essentially OLTP databases, and provide little in the way of OLAP (analytics) functionality. However, the Ethereum dataset is notably distinct from the Bitcoin dataset:
The Ethereum blockchain has as its primary unit of value Ether, while the Bitcoin blockchain has Bitcoin. However, the majority of value transfer on the Ethereum blockchain is composed of so-called tokens. Tokens are created and managed by smart contracts.
Ether value transfers are precise and direct, resembling accounting ledger debits and credits. This is in contrast to the Bitcoin value transfer mechanism, for which it can be difficult to determine the balance of a given wallet address.
Addresses can be not only wallets that hold balances, but can also contain smart contract bytecode that allows the programmatic creation of agreements and automatic triggering of their execution. An aggregate of coordinated smart contracts could be used to build a decentralized autonomous organization.
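Because Ether transfers behave like ledger debits and credits, the net flow for each address can be tallied directly. A simplified sketch, assuming hypothetical transfer records with `from`, `to`, and `value` keys; it ignores gas fees, block rewards, and token transfers, so it is not a full balance calculation:

```python
from collections import defaultdict

def net_flows(transfers):
    """Sum credits minus debits per address (values as integers, e.g. wei)."""
    balances = defaultdict(int)
    for transfer in transfers:
        balances[transfer["from"]] -= transfer["value"]  # debit the sender
        balances[transfer["to"]] += transfer["value"]    # credit the receiver
    return dict(balances)

# Hypothetical addresses and amounts for illustration only.
transfers = [
    {"from": "0xA", "to": "0xB", "value": 5},
    {"from": "0xB", "to": "0xC", "value": 2},
]
flows = net_flows(transfers)
```

Integer amounts are used so that sums stay exact, which matters for ledger-style arithmetic.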
The Ethereum blockchain data are now available for exploration with BigQuery. All historical data are in the ethereum_blockchain dataset, which updates daily.
Our hope is that by making the data on public blockchain systems more readily available it promotes technological innovation and increases societal benefits.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.crypto_ethereum.[TABLENAME]. Fork this kernel to get started.
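A minimal sketch of such a query, assuming the `google-cloud-bigquery` package is installed and credentials are available in the environment. The helper names are illustrative, and `transactions` is an assumed table name standing in for `[TABLENAME]`:

```python
def table_path(table_name):
    """Build the fully qualified name for a table in the public dataset."""
    return f"bigquery-public-data.crypto_ethereum.{table_name}"

def count_rows_sql(table_name):
    """Return a query that counts the rows in the given table."""
    return f"SELECT COUNT(*) AS row_count FROM `{table_path(table_name)}`"

def run_query(sql):
    """Run a query with the BigQuery client and return the rows as a list."""
    # Imported here so the helpers above work without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    return list(client.query(sql).result())

# "transactions" is an assumed table name standing in for [TABLENAME].
sql = count_rows_sql("transactions")
```

Calling `run_query(sql)` sends the query to BigQuery; building the SQL string separately makes it easy to inspect before anything is executed.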
Cover photo by Thought Catalog on Unsplash
https://www.promarketreports.com/privacy-policy
The size of the Cloud Analytics Market was valued at USD 23.82 billion in 2023 and is projected to reach USD 82.22 billion by 2032, with an expected CAGR of 19.36% during the forecast period. The cloud analytics market has witnessed significant growth, driven by the rising demand for data-driven decision-making and the increasing adoption of cloud computing technologies. Organizations are leveraging cloud analytics to process and analyze vast amounts of structured and unstructured data, enabling them to gain actionable insights and improve operational efficiency. The market's expansion is fueled by the scalability, cost-effectiveness, and real-time capabilities of cloud-based solutions compared to traditional on-premises systems. Industries such as retail, healthcare, banking, and IT are increasingly integrating cloud analytics into their operations to enhance customer experiences, optimize supply chains, and mitigate risks. Furthermore, advancements in artificial intelligence and machine learning are augmenting the analytical capabilities of cloud platforms, allowing businesses to forecast trends and automate complex processes. The growing popularity of hybrid and multi-cloud environments is also contributing to the market's growth by offering flexibility and addressing data security concerns. As organizations continue to prioritize digital transformation and data utilization, the cloud analytics market is poised for sustained expansion, driven by technological innovations and the increasing importance of real-time data insights.
Recent developments include: July 2020: Google LLC, a technology company, launched BigQuery Omni, a multi-cloud analytics solution that enables enterprises to access and securely analyze data across Amazon Web Services, Google Cloud, and Microsoft Azure. September 2020: TIBCO Software Inc., a leading enterprise data solution company, introduced TIBCO Hyperconverged Analytics. The Hyperconverged Analytics solution and services the company offers help combine data science, visual analytics, and streaming analytics to provide companies with expanded analytical strategies.
Key drivers for this market are: increasing data volumes and the need for insights; growing adoption of cloud computing platforms; advances in AI and machine learning; demand for real-time analytics; enhanced data security and compliance requirements.
Potential restraints include: data privacy and security concerns; data integration complexities; lack of skilled professionals; high implementation and maintenance costs; data center outages and downtime.
Notable trends are: hybrid cloud analytics models; predictive maintenance and prescriptive analytics; edge analytics and IoT integration; advanced data visualization techniques.