Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the Italian Coronavirus data repository from the Dipartimento della Protezione Civile. The dataset was created in response to the Coronavirus public health emergency in Italy and includes COVID-19 cases reported by region. More information on the data repository is available here. For additional information on Italy’s situation tracking and reporting, see the department’s Coronavirus site and interactive dashboard. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing: each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets, or see What is BigQuery. This dataset is hosted in both the EU and US regions of BigQuery; see the links below for the appropriate copy: US region, EU region. This dataset has significant public interest in light of the COVID-19 crisis, so all bytes processed in queries against it are zeroed out, making this part of the query free. Data joined with the dataset is billed at the normal rate to prevent abuse. After September 15, queries over these datasets will revert to the normal billing rate.
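As a quick illustration of querying this dataset, the sketch below composes a BigQuery SQL statement for one region's case counts. The table path `bigquery-public-data.covid19_italy.data_by_region` and the column names (`date`, `region_name`, `total_confirmed_cases`) are assumptions based on the dataset description; check the listing in BigQuery for the exact schema.

```python
# Sketch of a query against the Italian COVID-19 dataset.
# Table and column names are assumptions; verify against the BigQuery listing.

def build_region_query(region: str, limit: int = 10) -> str:
    """Compose a query for confirmed cases in one region, newest first."""
    return (
        "SELECT date, region_name, total_confirmed_cases "
        "FROM `bigquery-public-data.covid19_italy.data_by_region` "
        f"WHERE region_name = '{region}' "
        f"ORDER BY date DESC LIMIT {limit}"
    )

query = build_region_query("Lombardia")
print(query)

# Running it requires an authenticated client (not executed here):
# from google.cloud import bigquery
# rows = bigquery.Client().query(query).result()
```

Because queries against this dataset are zeroed out during the free period, running the statement above would not count against the 1TB monthly quota; joining it with other tables would.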
The FCC political ads public inspection files dataset contains political ad file information that broadcast stations have uploaded to their public inspection files, which are housed on the FCC website. This data includes all political ad files provided by TV and radio broadcast stations, dating back to 2012, when the FCC started requiring digital uploads of files to its website. Broadcasters are required to maintain this data in their public inspection files for two years, after which stations are permitted to remove it from the FCC website. This information is uploaded to the FCC’s website in PDF form and is not machine-readable. However, this dataset includes a content_info table containing manual annotations of some data fields, such as advertiser, gross spend, and ad air dates, along with a link to a copy of the PDF on Google Cloud Storage. The manual annotations, which are included only for a subset of the PDFs, come from either ProPublica’s Free the Files effort or from Google and are an experimental dataset. This dataset is a work in progress, with additional PDFs continually being annotated. All tables in this dataset are updated monthly. For more information about the dataset, visit the FCC website. To provide feedback on this dataset, please contact padl-feedback@googlegroups.com. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing: each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets, or see What is BigQuery.
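A sketch of how the annotated content_info table might be aggregated, e.g. total gross spend per advertiser for one year. The dataset path `bigquery-public-data.fcc_political_ads.content_info` follows the table name given in the description, but the exact project path and the column names (`advertiser`, `gross_spend`, `ad_air_date`) are assumptions; confirm them against the table schema before running.

```python
# Sketch: total gross spend per advertiser from the content_info table.
# Dataset path and column names are assumptions; check the schema in BigQuery.

def build_spend_query(year: int) -> str:
    """Compose a query summing gross spend per advertiser for one year."""
    return (
        "SELECT advertiser, SUM(gross_spend) AS total_spend "
        "FROM `bigquery-public-data.fcc_political_ads.content_info` "
        f"WHERE EXTRACT(YEAR FROM ad_air_date) = {year} "
        "GROUP BY advertiser ORDER BY total_spend DESC"
    )

print(build_spend_query(2020))
```

Since the annotations cover only a subset of the PDFs, totals from such a query reflect the annotated sample, not all political ad spending.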
Stack Overflow (SO) is the largest Q&A website for software developers, providing a huge amount of copyable code snippets. Recent studies have shown that developers regularly copy those snippets into their software projects, often without the required attribution. Besides possible licensing issues, maintenance issues may arise because the snippets evolve on SO, but the developers who copied the code are not aware of these changes. To help researchers investigate the evolution of code snippets on SO and their relation to other platforms such as GitHub, we built SOTorrent, an open data set based on the official SO data dump and the Google BigQuery GitHub data set. SOTorrent provides access to the version history of SO content at the level of whole posts and individual text or code blocks. Moreover, it links SO content to external resources in two ways: (1) by extracting linked URLs from text blocks of SO posts and (2) by providing a table with links to SO posts found in the source code of all projects in the BigQuery GitHub data set.
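The second linking mechanism described above (SO posts referenced from GitHub source code) can be sketched as a query against SOTorrent's post-reference table. The dataset version `sotorrent-org.2020_03_15`, the table name `PostReferenceGH`, and the column names (`PostId`, `RepoName`, `Path`) are assumptions; consult the SOTorrent release notes for the current release and schema.

```python
# Sketch: find GitHub files whose source code links to a given SO post,
# using SOTorrent's GitHub post-reference table. Dataset version, table
# name, and columns are assumptions; verify against the SOTorrent schema.

def build_reference_query(post_id: int) -> str:
    """Compose a query listing repositories and file paths referencing one post."""
    return (
        "SELECT RepoName, Path "
        "FROM `sotorrent-org.2020_03_15.PostReferenceGH` "
        f"WHERE PostId = {post_id}"
    )

print(build_reference_query(927358))
```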
Moved to this Zenodo record: https://zenodo.org/record/1135262
This dataset is maintained by the European Centre for Disease Prevention and Control (ECDC) and reports on the geographic distribution of COVID-19 cases worldwide, including reported cases and deaths broken out by country. The data can be visualized via ECDC’s Situation Dashboard. More information on ECDC’s response to COVID-19 is available here. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing: each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets, or see What is BigQuery. This dataset is hosted in both the EU and US regions of BigQuery; see the links below for the appropriate copy: US region, EU region. This dataset has significant public interest in light of the COVID-19 crisis, so all bytes processed in queries against it are zeroed out, making this part of the query free. Data joined with the dataset is billed at the normal rate to prevent abuse. After September 15, queries over these datasets will revert to the normal billing rate. Users of ECDC public-use data files must comply with data use restrictions to ensure that the information is used solely for statistical analysis or reporting purposes.
The GenCat Mobile Coverage app is an initiative of the Government of Catalonia to crowdsource data collection on the state of mobile telephone network coverage in Catalonia. The platform uses an Android app to record, through citizens' mobile devices, the level of coverage per operator, the network type (2G, 3G, or 4G), and the device's location. This dataset contains the platform data for the 2015-2017 period. The data can be used to analyze the quality of mobile coverage in Catalonia for the four main operators (Movistar, Vodafone, Orange, and Yoigo) and to filter measurements by the technology used (2G, 3G, or 4G). Additionally, the data enables the identification of areas in Catalonia that need improved mobile coverage, with the final goal of helping to improve the efficiency of basic services for the general public. Identical copies of this dataset are hosted in BigQuery's US and EU regions; below are the direct links to each copy: EU Region - Catalonia Mobile Coverage, US Region - Catalonia Mobile Coverage. The Government of Catalonia recommends the following source citation format for the reuse of the dataset by companies or users: Source: Government of Catalonia. [Digital Policies and Public Administration] This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing: each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets, or see What is BigQuery.
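The per-operator, per-technology filtering described above can be sketched as a query. The table path `bigquery-public-data.catalonian_mobile_coverage.mobile_data_2015_2017` and the column names (`net`, `operator_name`, `signal`) are assumptions; check the dataset schema in BigQuery before running.

```python
# Sketch: average signal level per operator for one network technology
# in the Catalonia mobile coverage dataset. Table path and column names
# are assumptions; verify against the BigQuery schema.

def build_coverage_query(network: str) -> str:
    """Compose a query averaging signal level per operator for one network type."""
    return (
        "SELECT operator_name, AVG(signal) AS avg_signal "
        "FROM `bigquery-public-data.catalonian_mobile_coverage.mobile_data_2015_2017` "
        f"WHERE net = '{network}' "
        "GROUP BY operator_name"
    )

print(build_coverage_query("4G"))
```

Swapping the `network` argument between '2G', '3G', and '4G' reproduces the technology filter the description mentions.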