4 datasets found

FCC Political Ads
console.cloud.google.com
Updated Aug 22, 2023
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:Federal%20Communications%20Commission&hl=ko (2023). FCC Political Ads [Dataset]. https://console.cloud.google.com/marketplace/product/federal-communications-commission/fcc-political-ads?hl=ko
Dataset updated
Aug 22, 2023
Dataset provided by
Google (http://google.com/)
Description
The FCC political ads public inspection files dataset contains political ad file information that broadcast stations have uploaded to their public inspection files, which are housed on the FCC website. The data includes all political ad files provided by TV and radio broadcast stations, dating back to 2012, when the FCC started requiring digital uploads of files to its website. Broadcasters are required to maintain this data in their public inspection files for two years, after which the stations are permitted to remove the files from the FCC website. The information is uploaded to the FCC’s website in PDF form and is not machine-readable. However, this dataset includes a content_info table that contains manual annotations of some data fields, such as advertiser, gross spend, and ad air dates, along with a link to a copy of the PDF, which can be found on Google Cloud Storage. The manual annotations, which are included only for a subset of the PDFs, come from either ProPublica’s Free the Files effort or from Google and are an experimental dataset. This dataset is a work in progress, with additional PDFs continually annotated. All tables in this dataset are updated monthly. For more information about the dataset, visit the FCC website. To provide feedback on this dataset, please contact padl-feedback@googlegroups.com. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing: each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
Cyclistic Bike Share: A Case Study
kaggle.com
zip
Updated Jul 25, 2023
Cite
Casey Kellerhals (2023). Cyclistic Bike Share: A Case Study [Dataset]. https://www.kaggle.com/datasets/caskelle/cyclistic-bike-share-a-case-study/code
Available download formats: zip (269575250 bytes)
Dataset updated
Jul 25, 2023
Authors
Casey Kellerhals
Description
The Mission Statement

Cyclistic, a bike sharing company, wants to analyze its user data to find the main differences in behavior between its two types of users: Casual Riders, who pay for each ride, and Annual Members, who pay a yearly subscription to the service.

PHASE 1 : ASK

Key objectives:

1. Identify The Business Task: Cyclistic wants to analyze the data to find the key differences between Casual Riders and Annual Members. The goal of this project is to reach out to casual riders and incentivize them to pay for the annual subscription.

2. Consider Key Stakeholders: The key stakeholders in this project are the executive team and the director of marketing, Lily Moreno.

PHASE 2 : Prepare

Key objectives:

1. Download Data And Store It Appropriately: I downloaded the data as .csv files and saved them in their own folder to keep everything organized. I then uploaded those files to BigQuery for cleaning and analysis. For this project I downloaded all of 2022 and 2023 up to May, as this is the most recent data that I have access to.

2. Identify How It's Organized: The data is organized by month, from 01-2022 to 05-2023.

3. Sort and Filter The Data and Determine The Credibility of The Data: I used BigQuery and SQL to sort, filter, and assess the credibility of the data. The data is collected first-hand by Cyclistic, and there is a lot of information to work with. I filtered the data down to the fields I wanted to work with: the types of bikes, the types of members, and the dates the bikes were used.

PHASE 3 : Process

Key objectives:

1. Clean The Data and Prepare The Data For Analysis: I used some simple SQL code to confirm that no member values were missing, that no information was repeated, and that there were no misspellings in the data.

-- Lists the distinct rider types. Only "member" and "casual" should appear, which confirms there are no misspellings that would drop rows from later results.
SELECT DISTINCT member_casual
FROM table

-- Counts the rides by casual riders and by members. The counts should add up to the number of rows in the dataset.
SELECT member_casual AS member_type, COUNT(*) AS total_riders
FROM table
GROUP BY member_type

-- Lists the distinct ride IDs. If the number of results equals the number of rows, every ride has a unique ID.
SELECT DISTINCT ride_id
FROM table

-- Lists the distinct bike types, which confirms there are no typos that would drop data from the results.
SELECT DISTINCT rideable_type
FROM table
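
The checks above can be sketched end to end in Python. This is a minimal illustration, not the author's workflow: it uses an in-memory SQLite database instead of BigQuery, and the table name `rides` and the sample rows are made up. It also compares `COUNT(*)` with `COUNT(DISTINCT ride_id)`, which is one way to turn the distinct-ID listing into an explicit pass/fail uniqueness check.

```python
import sqlite3

# Hypothetical sample rows standing in for one month of trip data.
SAMPLE_ROWS = [
    ("A1", "classic_bike", "member"),
    ("A2", "electric_bike", "casual"),
    ("A3", "docked_bike", "casual"),
    ("A4", "classic_bike", "member"),
]


def run_checks(rows):
    """Run the cleaning checks against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rides (ride_id TEXT, rideable_type TEXT, member_casual TEXT)"
    )
    conn.executemany("INSERT INTO rides VALUES (?, ?, ?)", rows)

    # Distinct rider types: anything other than member/casual is a typo.
    member_types = sorted(
        r[0] for r in conn.execute("SELECT DISTINCT member_casual FROM rides")
    )

    # Per-type counts should add up to the total row count.
    counts = dict(
        conn.execute(
            "SELECT member_casual, COUNT(*) FROM rides GROUP BY member_casual"
        )
    )

    # Uniqueness: total rows must equal the number of distinct ride IDs.
    total, distinct_ids = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT ride_id) FROM rides"
    ).fetchone()
    conn.close()

    return {
        "member_types": member_types,
        "counts": counts,
        "total_rows": total,
        "ids_unique": total == distinct_ids,
    }


report = run_checks(SAMPLE_ROWS)
print(report)
```

Appending a row that reuses an existing `ride_id` makes `ids_unique` come back `False`, which is the signal to deduplicate before analysis.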

PHASE 4 : Analyze

Key objectives:

1. Aggregate Your Data So It's Useful and Accessible: I wrote some SQL code to combine all the data from the different files I had uploaded to BigQuery.

SELECT rideable_type, started_at, ended_at, member_casual FROM table 1 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 2 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 3 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 4 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 5 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 6 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 7 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 8 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 9 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 10 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 11 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 12 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 13 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 14 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 15 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 16 UNION ALL
SELECT rideable_type, started_at, ended_at, member_casual FROM table 17
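
Writing seventeen near-identical SELECT statements by hand is error-prone, so a query like this one could also be generated. The sketch below is only an illustration: the `trips_` prefix and the `YYYYMM` suffix are hypothetical table names, since the case study does not say what the uploaded tables were called.

```python
from datetime import date

# Columns kept by the aggregation query in the case study.
COLUMNS = ["rideable_type", "started_at", "ended_at", "member_casual"]


def month_range(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def build_union_query(start, end, prefix="trips_"):
    """Return one UNION ALL query over the monthly tables.

    The prefix and the YYYYMM suffix are hypothetical naming choices.
    """
    column_list = ", ".join(COLUMNS)
    selects = [
        f"SELECT {column_list} FROM {prefix}{d:%Y%m}"
        for d in month_range(start, end)
    ]
    return "\nUNION ALL\n".join(selects)


# January 2022 through May 2023 covers the 17 monthly files.
query = build_union_query(date(2022, 1, 1), date(2023, 5, 1))
print(query)
```

If the monthly tables share a common prefix, BigQuery's wildcard tables may be a shorter alternative to generating the statement, though that depends on how the tables were named at upload.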

2. Identify trends and relationships: After aggregating the data I had chosen, I ran SQL code to determine the trends and relationships within it. I then uploaded the results into Google Sheets to make graphs that express those trends and make it easier to identify the key differences between Casual Riders and Annual Members.

-- Counts the rides taken by casual riders and by annual members.
SELECT member_casual AS member_type, COUNT(*) AS total_riders
FROM Aggregate Data Table
GROUP BY member_type

![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F14378099%2Fe09c3496bf38d323f8323f52f67...
Intellectual Property Investigations by the USITC
kaggle.com
zip
Updated Feb 12, 2019
Cite
Google BigQuery (2019). Intellectual Property Investigations by the USITC [Dataset]. https://www.kaggle.com/bigquery/usitc-investigations
Available download formats: zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Google (http://google.com/)
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Context

Section 337, Tariff Act of 1930, Investigations of Unfair Practices in Import Trade. Under section 337, the USITC determines whether there is unfair competition in the importation of products into, or their subsequent sale in, the United States. Section 337 prohibits the importation into the US, or the sale by owners, importers or consignees, of articles that infringe a patent, copyright, trademark, or semiconductor mask work, or where unfair competition or unfair acts exist that can destroy or substantially injure a US industry, prevent one from developing, or restrain or monopolize trade in US commerce. These latter categories are very broad: unfair competition can involve counterfeit, mismarked or misbranded goods; the sale of goods at unfairly low prices; other antitrust violations such as price fixing or market division; or goods that violate a standard applicable to them.

Content

US International Trade Commission 337Info Unfair Import Investigations Information System contains data on investigations done under Section 337. Section 337 declares the infringement of certain statutory intellectual property rights and other forms of unfair competition in import trade to be unlawful practices. Most Section 337 investigations involve allegations of patent or registered trademark infringement.

Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.

Acknowledgements

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:usitc_investigations

"US International Trade Commission 337Info Unfair Import Investigations Information System" by the USITC, for public use.

Banner photo by João Silas on Unsplash

NYC Taxi Trip Data - Google Public Data

kaggle.com

zip

Updated Jun 3, 2022

Cite

Neil Clack (2022). NYC Taxi Trip Data - Google Public Data [Dataset]. https://www.kaggle.com/datasets/neilclack/nyc-taxi-trip-data-google-public-data/data

Available download formats: zip (442010251 bytes)

Dataset updated

Jun 3, 2022

Authors

Neil Clack

Area covered

New York

Description

About

This dataset is a random subset of 10,000,000 rows from the NYC yellow taxi cab trips dataset in the Google BigQuery public datasets.
The data was not cleaned or altered in any way before being uploaded to Kaggle; I left that for notebook creators to do on their own.
The data will not be updated in the future and will remain "as-is". It was pulled at random using "ORDER BY RAND() LIMIT 10,000,000".
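
The sampling approach described above, sorting by a random key and keeping the first N rows, can be tried locally. The sketch below is an assumption-laden illustration: it runs against an in-memory SQLite table named `trips` (hypothetical) rather than the BigQuery public dataset, and SQLite spells the function `RANDOM()` where BigQuery uses `RAND()`.

```python
import sqlite3


def sample_rows(conn, table, n):
    """Return n trip IDs chosen at random by sorting on a random key."""
    # SQLite spells the function RANDOM(); BigQuery spells it RAND().
    # The table name is interpolated, so it must come from trusted code.
    query = f"SELECT trip_id FROM {table} ORDER BY RANDOM() LIMIT ?"
    return [row[0] for row in conn.execute(query, (n,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (trip_id INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO trips VALUES (?)", [(i,) for i in range(1000)])

sample = sample_rows(conn, "trips", 10)
print(sample)
```

Because every row receives a random sort key, this technique reads and sorts the whole table, which is simple but can be costly on very large tables.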

taxi_trip_data.csv contains the relevant trip data and pickup and dropoff "zones"
taxi_zone_geo.csv contains the longitude and latitude coordinates of the pickup zone areas, stored as list-like sequences of polygon vertices.

Goal

This dataset was extracted and uploaded for the purpose of experimenting with and learning regression models for price prediction. There is also plenty of room for data cleaning and outlier handling, and enough data for more realistic model training, testing, and validation.
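
As a starting point for the price-prediction goal, the simplest regression model is a straight-line fit of fare against trip distance. The sketch below is a minimal illustration using ordinary least squares in plain Python; the distances and fares are made-up values that follow an exact line, not rows from the dataset, and a real model would need cleaned data and more features.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up trips: distance in miles and fare in dollars (not real data).
distances = [0.5, 1.0, 2.0, 3.5, 5.0]
fares = [2.5 + 2.0 * d for d in distances]

intercept, slope = fit_line(distances, fares)

# Predict the fare for a hypothetical 4-mile trip.
predicted = intercept + slope * 4.0
print(intercept, slope, predicted)
```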

Columns

column	type	nullable	description
vendor_id	text	required	A code indicating the TPEP provider that provided the record. 1= Creative Mobile Technologies, LLC; 2= VeriFone Inc
pickup_datetime	datetime	nullable	The date and time when the meter was engaged.
dropoff_datetime	datetime	nullable	The date and time when the meter was disengaged.
passenger_count	integer	nullable	The number of passengers in the vehicle. This is a driver-entered value
trip_distance	numeric	nullable	The elapsed trip distance in miles reported by the taximeter.
rate_code	string	nullable	The final rate code in effect at the end of the trip. 1= Standard rate 2=JFK 3=Newark 4=Nassau or Westchester 5=Negotiated fare 6=Group ride
store_and_fwd_flag	string	nullable	This flag indicates whether the trip record was held in vehicle memory before sending to the vendor, aka “store and forward,” because the vehicle did not have a connection to the server. Y= store and forward trip N= not a store and forward trip
payment_type	string	nullable	A numeric code signifying how the passenger paid for the trip. 1= Credit card 2= Cash 3= No charge 4= Dispute 5= Unknown 6= Voided trip
fare_amount	numeric	nullable	The time-and-distance fare calculated by the meter
extra	numeric	nullable	Miscellaneous extras and surcharges. Currently, this only includes the $0.50 and $1 rush hour and overnight charges.
mta_tax	numeric	nullable	$0.50 MTA tax that is automatically triggered based on the metered rate in use
tip_amount	numeric	nullable	Tip amount – This field is automatically populated for credit card tips. Cash tips are not included
tolls_amount	numeric	nullable	Total amount of all tolls paid in the trip.
imp_surcharge	numeric	nullable	$0.30 improvement surcharge assessed trips at the flag drop. The improvement surcharge began being levied in 2015.
total_amount	numeric	nullable	The total amount charged to passengers. Does not include cash tips
pickup_location_id	string	nullable	TLC Taxi Zone in which the taximeter was engaged
dropoff_location_id	string	nullable	TLC Taxi Zone in which the taximeter was disengaged
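
Several columns above store coded values, so a lookup step is often useful before analysis. The sketch below is an illustration under stated assumptions: the code-to-label mappings are copied from the column descriptions, while the `decode_trip` helper, the `payment_label` and `rate_label` keys, and the example row are hypothetical.

```python
# Code-to-label mappings taken from the column descriptions.
PAYMENT_TYPES = {
    "1": "Credit card",
    "2": "Cash",
    "3": "No charge",
    "4": "Dispute",
    "5": "Unknown",
    "6": "Voided trip",
}
RATE_CODES = {
    "1": "Standard rate",
    "2": "JFK",
    "3": "Newark",
    "4": "Nassau or Westchester",
    "5": "Negotiated fare",
    "6": "Group ride",
}


def decode_trip(row):
    """Return a copy of row with human-readable labels added."""
    decoded = dict(row)
    # Codes missing from the mappings are flagged rather than dropped.
    decoded["payment_label"] = PAYMENT_TYPES.get(
        row.get("payment_type"), "Unrecognized"
    )
    decoded["rate_label"] = RATE_CODES.get(row.get("rate_code"), "Unrecognized")
    return decoded


# Hypothetical row, shaped as csv.DictReader would produce it (string values).
example = {"payment_type": "1", "rate_code": "2", "fare_amount": "52.0"}
decoded = decode_trip(example)
print(decoded)
```

Flagging unknown codes as "Unrecognized" keeps unexpected values visible, which matters here because the data was uploaded without cleaning.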
