CSV version of Looker Ecommerce Dataset.
Overview (dataset in BigQuery): TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1 TB/mo of free-tier processing. This means that each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
distribution_centers.csv
- id: Unique identifier for each distribution center.
- name: Name of the distribution center.
- latitude: Latitude coordinate of the distribution center.
- longitude: Longitude coordinate of the distribution center.

events.csv
- id: Unique identifier for each event.
- user_id: Identifier for the user associated with the event.
- sequence_number: Sequence number of the event.
- session_id: Identifier for the session during which the event occurred.
- created_at: Timestamp indicating when the event took place.
- ip_address: IP address from which the event originated.
- city: City where the event occurred.
- state: State where the event occurred.
- postal_code: Postal code of the event location.
- browser: Web browser used during the event.
- traffic_source: Source of the traffic leading to the event.
- uri: Uniform Resource Identifier associated with the event.
- event_type: Type of event recorded.

inventory_items.csv
- id: Unique identifier for each inventory item.
- product_id: Identifier for the associated product.
- created_at: Timestamp indicating when the inventory item was created.
- sold_at: Timestamp indicating when the item was sold.
- cost: Cost of the inventory item.
- product_category: Category of the associated product.
- product_name: Name of the associated product.
- product_brand: Brand of the associated product.
- product_retail_price: Retail price of the associated product.
- product_department: Department to which the product belongs.
- product_sku: Stock Keeping Unit (SKU) of the product.
- product_distribution_center_id: Identifier for the distribution center associated with the product.

order_items.csv
- id: Unique identifier for each order item.
- order_id: Identifier for the associated order.
- user_id: Identifier for the user who placed the order.
- product_id: Identifier for the associated product.
- inventory_item_id: Identifier for the associated inventory item.
- status: Status of the order item.
- created_at: Timestamp indicating when the order item was created.
- shipped_at: Timestamp indicating when the order item was shipped.
- delivered_at: Timestamp indicating when the order item was delivered.
- returned_at: Timestamp indicating when the order item was returned.

orders.csv
- order_id: Unique identifier for each order.
- user_id: Identifier for the user who placed the order.
- status: Status of the order.
- gender: Gender information of the user.
- created_at: Timestamp indicating when the order was created.
- returned_at: Timestamp indicating when the order was returned.
- shipped_at: Timestamp indicating when the order was shipped.
- delivered_at: Timestamp indicating when the order was delivered.
- num_of_item: Number of items in the order.

products.csv
- id: Unique identifier for each product.
- cost: Cost of the product.
- category: Category to which the product belongs.
- name: Name of the product.
- brand: Brand of the product.
- retail_price: Retail price of the product.
- department: Department to which the product belongs.
- sku: Stock Keeping Unit (SKU) of the product.
- distribution_center_id: Identifier for the distribution center associated with the product.

users.csv
- id: Unique identifier for each user.
- first_name: First name of the user.
- last_name: Last name of the user.
- email: Email address of the user.
- age: Age of the user.
- gender: Gender of the user.
- state: State where t...
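Because the files share obvious join keys (user_id, product_id, order_id, inventory_item_id), they can be combined directly with pandas. A minimal sketch, assuming local copies of the CSV files named exactly as above; the revenue-by-category rollup is only an illustrative choice:

```python
import pandas as pd

# Assumed local copies of the CSV files listed above.
order_items = pd.read_csv(
    "order_items.csv",
    parse_dates=["created_at", "shipped_at", "delivered_at", "returned_at"],
)
products = pd.read_csv("products.csv")

# Join each order item to its product and roll up retail value by category.
merged = order_items.merge(
    products, left_on="product_id", right_on="id", suffixes=("_item", "_product")
)
revenue_by_category = (
    merged.groupby("category")["retail_price"].sum().sort_values(ascending=False)
)
print(revenue_by_category.head(10))
```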
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Presentation for a hands-on training session designed to help participants learn or refine their skills in analysing OpenAIRE Graph data from the Google Cloud with BigQuery. The workshop lasted 4 hours and alternated between presentations and hands-on practice with guidance from trainers. The training covered:
- Introduction to Google Cloud and BigQuery
- Introduction to the OpenAIRE Graph on BigQuery
- Gentle introduction to SQL
- Simple queries walkthrough and exercises
- Advanced queries (e.g., with JOINs and BigQuery functions) walkthrough and exercises
- Data takeout + Python notebooks on Google BigQuery
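For a flavour of the hands-on part, a query can be run from a Python notebook with the google-cloud-bigquery client. The project, dataset, and table names below are placeholders, not the actual OpenAIRE Graph tables used in the workshop:

```python
from google.cloud import bigquery

# Requires Google Cloud credentials, e.g. via `gcloud auth application-default login`.
client = bigquery.Client()

# Placeholder table path; substitute the OpenAIRE Graph table used in the workshop.
sql = """
SELECT COUNT(*) AS n_rows
FROM `my-project.openaire_graph.publications`
"""

print(client.query(sql).to_dataframe())
```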
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
A description of the Kaggle dataset related to the cloud-training-demos.SAP_REPLICATED_DATA BigQuery public dataset:
Dataset ID: cloud-training-demos.SAP_REPLICATED_DATA
Overview:
The SAP_REPLICATED_DATA dataset in BigQuery provides a comprehensive replication of SAP (Systems, Applications, and Products in Data Processing) business data. This dataset is designed to support data analytics and machine learning tasks by offering a rich set of structured data that mimics real-world enterprise scenarios. It includes data from various SAP modules and processes, enabling users to perform in-depth analysis, build predictive models, and explore business insights.
Content:
- Tables and Schemas: The dataset consists of multiple tables representing different aspects of SAP business operations, including but not limited to sales, inventory, finance, and procurement data.
- Data Types: It contains structured data with fields such as transaction IDs, timestamps, customer details, product information, sales figures, and financial metrics.
- Data Volume: The dataset is designed to simulate large-scale enterprise data, making it suitable for performance testing, data processing, and analysis.
Usage:
- Business Analytics: Users can analyze business trends, sales performance, and financial metrics.
- Machine Learning: Ideal for developing and testing machine learning models related to business forecasting, anomaly detection, and customer segmentation.
- Data Processing: Suitable for practicing SQL queries, data transformation, and integration tasks.
Example Use Cases:
- Sales Analysis: Track and analyze sales performance across different regions and time periods.
- Inventory Management: Monitor inventory levels and identify trends in stock movements.
- Financial Reporting: Generate financial reports and analyze expense patterns.
For more information and to access the dataset, visit the BigQuery public datasets page or refer to the dataset documentation in the BigQuery console.
The dataset includes the following files:
| File Name | Description |
|---|---|
| adr6.csv | Addresses with organizational units. Contains address details related to organizational units like departments or branches. |
| adrc.csv | General Address Data. Provides information about addresses, including details such as street, city, and postal codes. |
| adrct.csv | Address Contact Information. Contains contact information linked to addresses, including phone numbers and email addresses. |
| adrt.csv | Address Details. Includes detailed address data such as street addresses, city, and country codes. |
| ankt.csv | Accounting Document Segment. Provides details on segments within accounting documents, including account numbers and amounts. |
| anla.csv | Asset Master Data. Contains information about fixed assets, including asset identification and classification. |
| bkpf.csv | Accounting Document Header. Contains headers of accounting documents, such as document numbers and fiscal year. |
| bseg.csv | Accounting Document Segment. Details line items within accounting documents, including account details and amounts. |
| but000.csv | Business Partners. Contains basic information about business partners, including IDs and names. |
| but020.csv | Business Partner Addresses. Provides address details associated with business partners. |
| cepc.csv | Customer Master Data - Central. Contains centralized data for customer master records. |
| cepct.csv | Customer Master Data - Contact. Provides contact details associated with customer records. |
| csks.csv | Cost Center Master Data. Contains data about cost centers within the organization. |
| cskt.csv | Cost Center Texts. Provides text descriptions and labels for cost centers. |
| dd03l.csv | Data Element Field Labels. Contains labels and descriptions for data fields in the SAP system. |
| ekbe.csv | Purchase Order History. Details history of purchase orders, including quantities and values. |
| ekes.csv | Purchasing Document History. Contains history of purchasing documents including changes and statuses. |
| eket.csv | Purchase Order Item History. Details changes and statuses for individual purchase order items. |
| ekkn.csv | Purchase Order Account Assignment. Provides account assignment details for purchas... |
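As an illustration of the kind of analysis this schema supports, the sketch below joins accounting document headers (bkpf) to their line items (bseg) using the standard SAP keys (MANDT, BUKRS, BELNR, GJAHR). The exact table paths and column casing in cloud-training-demos.SAP_REPLICATED_DATA are assumptions to verify in the BigQuery console:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Join accounting document headers (bkpf) to line items (bseg).
# Standard SAP join keys are assumed (MANDT, BUKRS, BELNR, GJAHR); verify the
# actual table names and column casing in the replicated dataset.
sql = """
SELECT
  h.bukrs AS company_code,
  h.gjahr AS fiscal_year,
  COUNT(DISTINCT h.belnr) AS documents,
  COUNT(*) AS line_items
FROM `cloud-training-demos.SAP_REPLICATED_DATA.bkpf` AS h
JOIN `cloud-training-demos.SAP_REPLICATED_DATA.bseg` AS l
  ON l.mandt = h.mandt
 AND l.bukrs = h.bukrs
 AND l.belnr = h.belnr
 AND l.gjahr = h.gjahr
GROUP BY company_code, fiscal_year
ORDER BY documents DESC
"""

print(client.query(sql).to_dataframe().head())
```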
Reddit posts, 2019-01-01 through 2019-08-01.
Source: https://console.cloud.google.com/bigquery?p=fh-bigquery&page=project
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
NYC Open Data is an opportunity to engage New Yorkers in the information that is produced and used by City government. We believe that every New Yorker can benefit from Open Data, and Open Data can benefit from every New Yorker. Source: https://opendata.cityofnewyork.us/overview/
Thanks to NYC Open Data, which makes public data generated by city agencies available for public use, and Citi Bike, we've incorporated over 150 GB of data in 5 open datasets into Google BigQuery Public Datasets, including:
Over 8 million 311 service requests from 2012-2016
More than 1 million motor vehicle collisions 2012-present
Citi Bike stations and 30 million Citi Bike trips 2013-present
Over 1 billion Yellow and Green Taxi rides from 2009-present
Over 500,000 sidewalk trees surveyed decennially in 1995, 2005, and 2015
This dataset is deprecated and not being updated.
Fork this kernel to get started with this dataset.
https://opendata.cityofnewyork.us/
This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://data.cityofnewyork.us/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
By accessing datasets and feeds available through NYC Open Data, the user agrees to all of the Terms of Use of NYC.gov as well as the Privacy Policy for NYC.gov. The user also agrees to any additional terms of use defined by the agencies, bureaus, and offices providing data. Public data sets made available on NYC Open Data are provided for informational purposes. The City does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set made available on NYC Open Data, nor are any such warranties to be implied or inferred with respect to the public data sets furnished therein.
The City is not liable for any deficiencies in the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set, or application utilizing such data set, provided by any third party.
Banner photo by @bicadmedia from Unsplash.
On which New York City streets are you most likely to find a loud party?
Can you find the Virginia Pines in New York City?
Where was the only collision caused by an animal that injured a cyclist?
What’s the Citi Bike record for the Longest Distance in the Shortest Time (on a route with at least 100 rides)?
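The first question above (loud parties) can be approached with a query against the 311 service requests table. A minimal sketch, assuming the bigquery-public-data.new_york_311.311_service_requests table and its complaint_type, descriptor, and street_name columns; confirm the names in the BigQuery console before relying on the output:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Streets with the most loud-party noise complaints.
# Table and column names (complaint_type, descriptor, street_name) are assumed
# from the public NYC 311 dataset; confirm them in the BigQuery console.
sql = """
SELECT street_name, COUNT(*) AS complaints
FROM `bigquery-public-data.new_york_311.311_service_requests`
WHERE complaint_type LIKE 'Noise%'
  AND descriptor LIKE '%Loud Music/Party%'
  AND street_name IS NOT NULL
GROUP BY street_name
ORDER BY complaints DESC
LIMIT 10
"""

print(client.query(sql).to_dataframe())
```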
U.S. Government Works: https://www.usa.gov/government-works/
Chicago is one of America's most iconic cities, with a rich and colorful history. Recently, Chicago was also a setting for one of Netflix's popular series: Ozark. The story has it that Chicago is the center of drug distribution for the Navarro cartel.
So, how true is the series? A quick search on the internet turns up a recently released DEA report. The report shows that drug crime does exist in Chicago, although distribution is handled by the Cartel de Jalisco Nueva Generacion, the Sinaloa Cartel and the Guerreros Unidos, to name a few.
The government of the City of Chicago has provided a publicly available crime database accessible via Google BigQuery. I have downloaded a subset of the data with crime_type narcotics and year > 2015. The data contains records from 1 Jan 2016 UTC until 23 Jul 2020 UTC.
The dataset contains these columns:
- case_number: ID of the record
- date: Date of the incident
- iucr: Category of the crime, per the Illinois Uniform Crime Reporting (IUCR) code. [more](https://data.cityofchicago.org/widgets/c7ck-438e)
- description: More detailed description of the crime
- location_description: Location of the crime
- arrest: Whether an arrest was made
- domestic: Whether the crime was domestic
- district: Code of the police district where the crime happened. [more](https://data.cityofchicago.org/Public-Safety/Boundaries-Police-Districts-current-/fthy-xz3r)
- ward: The ward code where the crime happened. [more](https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards-2015-/sp34-6z76)
- community_area: The community area code where the crime happened. more
The data is owned and kindly provided by the City of Chicago.
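For reference, a similar subset can be reproduced directly against the public BigQuery table. A sketch assuming the bigquery-public-data.chicago_crime.crime table, where the filter column is primary_type (the description above calls it crime_type), together with the year column:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Reproduce a narcotics-only subset similar to the one described above.
# Table and column names (primary_type, year) follow the public Chicago crime
# dataset as assumed here; confirm them in the BigQuery console.
sql = """
SELECT case_number, date, iucr, description, location_description,
       arrest, domestic, district, ward, community_area
FROM `bigquery-public-data.chicago_crime.crime`
WHERE primary_type = 'NARCOTICS'
  AND year > 2015
"""

narcotics = client.query(sql).to_dataframe()
print(narcotics.shape)
```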
Some questions to get you started:
Lastly, if you are:
- a newly recruited analyst at the DEA / police, what would you recommend?
- asked by el jefe del cartel (the boss of the cartel) how to expand operations / operate better, what would you say?
Happy wrangling!
The table posts is part of the dataset Reddit, available at https://redivis.com/datasets/prpw-49sqq9ehv. It contains 150,795,895 rows across 33 variables.
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
From an end-user perspective, here is a brief overview of the purpose of each transaction type:
Pay-to-Public-Key-Hash (P2PKH) transaction: This is the most common type of transaction in Bitcoin, where the sender sends bitcoin to the recipient's Bitcoin address. P2PKH transactions are used for everyday transactions, such as buying goods or services.
Pay-to-Script-Hash (P2SH) transaction: This type of transaction allows for more complex scripts to be used as the receiving address. P2SH transactions are used to enable advanced scripting features, such as multi-sig transactions and time-locked transactions.
Multi-Signature (Multi-Sig) transaction: This type of transaction requires multiple signatures to authorize a transaction, making it more secure. Multi-sig transactions are used in situations where multiple parties need to approve a transaction, such as for joint accounts or high-value transactions.
Segregated Witness (SegWit) transaction: This is a type of transaction that separates transaction signature data from the transaction data, reducing the size of the transaction and increasing transaction capacity. SegWit transactions are used to reduce fees and improve transaction speed.
Lightning Network transaction: This is a layer 2 scaling solution that allows for instant and low-cost transactions by opening a payment channel between two parties. Lightning Network transactions are used for frequent and small-value transactions, such as micropayments and instant payments.
It's worth noting that this list may not cover every possible transaction type in the Bitcoin network, since there may be variations or new types of output scripts that are not yet recognized or categorized by the outputs.script_type field. Additionally, some complex transactions may use multiple output scripts of different types, which can complicate their categorization.
The distribution of transaction types in the Bitcoin/Blockchain ecosystem can vary depending on the period analyzed and the specific data source used. However, here is a general overview of the distribution of transaction types in Bitcoin:
Regular transactions (Pay-to-Public-Key-Hash or P2PKH transactions) are the most common type of transaction in the Bitcoin network. In some periods, regular transactions account for over 95% of all transactions in the network.
Pay-to-Script-Hash (P2SH) transactions are the second most common type of transaction, accounting for around 3-4% of transactions.
Multi-Signature (Multi-Sig) transactions, Segregated Witness (SegWit) transactions, and Lightning Network transactions together account for less than 1% of all transactions in the Bitcoin network.
It's important to note that the distribution of transaction types can change over time as the Bitcoin network evolves and new features and technologies are introduced. Also, the distribution of transaction types can vary across different blockchain networks other than Bitcoin.
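Rough figures like the distribution above can be checked against the on-chain data. A minimal sketch, assuming the bigquery-public-data.crypto_bitcoin.outputs table, where the script-type column is named type rather than script_type, and using 2023 as an arbitrary example window:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Share of each output script type over an example window (2023).
# The public crypto_bitcoin dataset names the column `type`; the text above
# refers to it as `script_type`, so adjust the query if your source differs.
sql = """
SELECT
  type AS script_type,
  COUNT(*) AS outputs,
  ROUND(100 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS pct
FROM `bigquery-public-data.crypto_bitcoin.outputs`
WHERE block_timestamp >= TIMESTAMP '2023-01-01'
  AND block_timestamp < TIMESTAMP '2024-01-01'
GROUP BY type
ORDER BY outputs DESC
"""

print(client.query(sql).to_dataframe())
```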
The table subreddits is part of the dataset Reddit, available at https://redivis.com/datasets/prpw-49sqq9ehv. It contains 2,499 rows across 7 variables.
Assume you are a data analyst at an EdTech company. The company's customer success team works to help customers get the maximum value from the product by taking deeper dives into customers' needs, wants, and expectations of the product and helping them reach their goals.
The customer success team is aiming to achieve sustainable growth by focusing on retaining the existing users.
Therefore, your team wants to analyze the activity of your existing users and understand their performance, behaviours, and patterns to gain meaningful insights that help your customer success team make data-informed decisions.
Your recommendations must be backed by meaningful insights and professional visualizations which will help your customer success team design road maps, strategies, and action items to achieve the goal.
The dataset contains the basic details of the enrolled users, their learning-resource completion percentages, their activity on the platform, and the structure of the learning resources available on the platform.
1. **users_basic_details**: Contains basic details of the enrolled users.
2. **day_wise_user_activity**: Contains the details of the day-wise learning activity of the users.
   - A user shall have one entry per lesson per day.
3. **learning_resource_details**: Contains the details of the learning resources offered to the enrolled users.
   - Content is stored in a hierarchical structure: Track → Course → Topic → Lesson. A lesson can be a video, practice, exam, etc.
   - Example: Tech Foundations → Developer Foundations → Topic 1 → Lesson 1
4. **feedback_details**: Contains the feedback details/rating given by the user for a particular lesson.
   - Feedback rating is given on a scale of 1 to 5, 5 being the highest.
   - A user can give feedback on the same lesson multiple times.
5. **discussion_details**: Contains the details of the discussions created by the user for a particular lesson.
6. **discussion_comment_details**: Contains the details of the comments posted on the discussions created by the user.
   - Comments may be posted by mentors or by users themselves.
   - The role of mentors is to guide and help the users by resolving the doubts and issues they face in their learning activity.
   - A discussion can have multiple comments.
users_basic_details:
- user_id: unique id of the user [string]
- gender: gender of the enrolled user [string]
- current_city: city of residence of the user [string]
- batch_start_datetime: start datetime of the batch for which the user is enrolled [datetime]
- referral_source: referral channel of the user [string]
- highest_qualification: highest qualification (education details) of the enrolled user [string]

day_wise_user_activity:
- activity_datetime: date and time of the user's learning activity [datetime]
- user_id: unique id of the user [string]
- lesson_id: unique id of the lesson [string]
- lesson_type: type of the lesson; can be "SESSION", "PRACTICE", "EXAM" or "PROJECT" [string]
- day_completion_percentage: percentage of the lesson completed by the user on that particular day (out of 100%) [float]
- overall_completion_percentage: overall completion percentage of the lesson to date by the user (out of 100%) [float]
- Example progression for a single lesson: day_completion_percentage 10% / overall_completion_percentage 10%, then 35% / 45%, then 37% / 82%, then 18% / 100%.

learning_resource_details:
- track_id: unique id of the track [string]
- track_title: name of the track [string]
- course_id: unique id of the course [string]
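As a starting point for the activity analysis, lesson completion can be rolled up per user from day_wise_user_activity. A minimal pandas sketch, assuming each table above is available as a CSV file named after the table:

```python
import pandas as pd

# Assumed file names: one CSV per table described above.
activity = pd.read_csv("day_wise_user_activity.csv", parse_dates=["activity_datetime"])
users = pd.read_csv("users_basic_details.csv")

# Keep the latest overall_completion_percentage per user and lesson, then
# average it per user as a simple engagement signal.
latest = (
    activity.sort_values("activity_datetime")
    .groupby(["user_id", "lesson_id"], as_index=False)
    .last()
)
per_user = (
    latest.groupby("user_id", as_index=False)["overall_completion_percentage"]
    .mean()
    .rename(columns={"overall_completion_percentage": "avg_lesson_completion"})
)

report = users.merge(per_user, on="user_id", how="left")
print(report[["user_id", "avg_lesson_completion"]].head())
```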