The Google Trends dataset will provide critical signals that individual users and businesses alike can leverage to make better data-driven decisions. This dataset simplifies the manual interaction with the existing Google Trends UI by automating and exposing anonymized, aggregated, and indexed search data in BigQuery. This dataset includes the Top 25 stories and Top 25 Rising queries from Google Trends. It will be made available as two separate BigQuery tables, with a set of new top terms appended daily. Each set of Top 25 and Top 25 Rising terms expires after 30 days and will be accompanied by a rolling five-year window of historical data in 210 distinct locations in the United States. This Google dataset is hosted in Google BigQuery as part of Google Cloud's Datasets solution and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
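As a rough illustration of querying the top-terms table, the sketch below only builds a SQL string; it does not contact BigQuery. The table name `bigquery-public-data.google_trends.top_terms` and the column names `term`, `rank`, `dma_name`, and `refresh_date` are assumptions that do not appear in the description above and should be checked against the schema in the BigQuery console.

```python
# Assumed public table name; verify it in the BigQuery console before use.
TOP_TERMS_TABLE = "bigquery-public-data.google_trends.top_terms"


def build_top_terms_query(limit: int = 25) -> str:
    """Build SQL for the most recently refreshed top terms in one location.

    The column names (term, rank, dma_name, refresh_date) are assumptions
    about the table schema, not facts taken from the dataset description.
    """
    if not 1 <= limit <= 25:
        raise ValueError("limit must be between 1 and 25")
    return (
        "SELECT DISTINCT term, rank\n"
        f"FROM `{TOP_TERMS_TABLE}`\n"
        "WHERE dma_name = @dma_name\n"
        "  AND refresh_date = (\n"
        f"    SELECT MAX(refresh_date) FROM `{TOP_TERMS_TABLE}`\n"
        "  )\n"
        "ORDER BY rank\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_top_terms_query(limit=10))
```

`@dma_name` is a named query parameter, so the location string would be supplied by whichever BigQuery client runs the query rather than pasted into the SQL text.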
You can check the field descriptions in the documentation: current Keyword database: https://docs.dataforseo.com/v3/databases/google/keywords/?bash; Historical Keyword database: https://docs.dataforseo.com/v3/databases/google/history/keywords/?bash. You don’t have to download fresh data dumps in JSON or CSV – we can deliver data straight to your storage or database. We send terabytes of data to dozens of customers every month using Amazon S3, Google Cloud Storage, Microsoft Azure Blob, Elasticsearch, and Google BigQuery. Let us know if you’d like to get your data to any other storage or database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Georgia Google Search Trends: Government Measures: Government Subsidy data was reported at 0.000 Score on 14 May 2025, unchanged from 0.000 Score on 13 May 2025. The series is updated daily, with a median of 0.000 Score from Dec 2021 to 14 May 2025 across 1261 observations. The data reached an all-time high of 38.000 Score on 02 Jul 2023 and a record low of 0.000 Score on 14 May 2025. The series remains in active status in CEIC and is reported by Google Trends. The data is categorized under Global Database’s Georgia – Table GE.Google.GT: Google Search Trends: by Categories.
You can check the field descriptions in the documentation: current Full database: https://docs.dataforseo.com/v3/databases/google/full/?bash; Historical Full database: https://docs.dataforseo.com/v3/databases/google/history/full/?bash.
Full Google Database is a combination of the Advanced Google SERP Database and Google Keyword Database.
Google SERP Database offers millions of SERPs collected in 67 regions with most of Google’s advanced SERP features, including featured snippets, knowledge graphs, people also ask sections, top stories, and more.
Google Keyword Database encompasses billions of search terms enriched with related Google Ads data: search volume trends, CPC, competition, and more.
This database is available in JSON format only.
You don’t have to download fresh data dumps in JSON – we can deliver data straight to your storage or database. We send terabytes of data to dozens of customers every month using Amazon S3, Google Cloud Storage, Microsoft Azure Blob, Elasticsearch, and Google BigQuery. Let us know if you’d like to get your data to any other storage or database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Google Search Trends: Computer & Electronics: Apple data was reported at 30.000 Score on 14 May 2025, a decrease from 33.000 Score on 13 May 2025. The series is updated daily, with a median of 33.000 Score from Dec 2021 to 14 May 2025 across 1261 observations. The data reached an all-time high of 100.000 Score on 09 Sep 2024 and a record low of 0.000 Score on 22 Jun 2023. The series remains in active status in CEIC and is reported by Google Trends. The data is categorized under Global Database’s Bulgaria – Table BG.Google.GT: Google Search Trends: by Categories.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Alongside the COVID-19 pandemic, government authorities around the world have had to face a growing infodemic capable of causing serious damage to public health and the economy. In this context, the use of infoveillance tools has become a primary necessity.
Objective: The aim of this study is to test the reliability of a widely used infoveillance tool, Google Trends. In particular, the paper analyzes relative search volumes (RSVs), quantifying their dependence on the day they are collected.
Methods: RSVs of the query coronavirus + covid during February 1 to December 4, 2020 (period 1), and February 20 to May 18, 2020 (period 2), were collected daily from Google Trends from December 8 to 27, 2020. The survey covered Italian regions and cities, and countries and cities worldwide. The search category was set to all categories. Each dataset was analyzed to observe any dependence of RSVs on the day they were gathered. To do this, calling $i$ the country, region, or city under investigation and $j$ the day its RSV was collected, a Gaussian distribution $X_i = X(\sigma_i, \bar{x}_i)$ was used to represent the trend of daily variations of $x_{ij} = \mathrm{RSV}_{ij}$. When a missing value was found (anomaly), the affected country, region, or city was excluded from the analysis. When the anomalies exceeded 20% of the sample size, the whole sample was excluded from the statistical analysis. Pearson and Spearman correlations between RSVs and the number of COVID-19 cases were calculated day by day to highlight any variations related to the day RSVs were collected. Welch’s t-test was used to assess the statistical significance of the differences between the average RSVs of the various countries, regions, or cities of a given dataset. Two RSVs were considered statistically confident when t
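For readers unfamiliar with Welch's t-test, which the methods above name, here is a minimal standard-library sketch of the statistic and its Welch-Satterthwaite degrees of freedom. It is not the study's code; the function name `welch_t` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error of each sample mean: sample variance / n.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical daily RSV readings for two regions (not data from the study).
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

The p-value would then come from a Student t distribution with `dof` degrees of freedom, for which a statistics library is the usual choice.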
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China Google Search Trends: Online Shopping: Tmall data was reported at 8.000 Score on 14 May 2025, unchanged from 8.000 Score on 13 May 2025. The series is updated daily, with a median of 0.000 Score from Dec 2021 to 14 May 2025 across 1261 observations. The data reached an all-time high of 70.000 Score on 22 Jan 2023 and a record low of 0.000 Score on 02 May 2025. The series remains in active status in CEIC and is reported by Google Trends. The data is categorized under Global Database’s China – Table CN.Google.GT: Google Search Trends: by Categories.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The goal of this research is to examine direct answers in the Google web search engine. The dataset was collected using Senuto (https://www.senuto.com/), an online tool that extracts data on website visibility from the Google search engine.
Dataset contains the following elements:
The dataset with visibility structure has 743,798 keywords that resulted in SERPs with a direct answer.
This dataset provides monthly insights into how people find State of Iowa agency listings on the web via Google Search and Maps, and what they do once they find them, including providing reviews (ratings), accessing agency websites, requesting directions, and making calls.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This is the Google Search interest data that powers the Visualisation Searching For Health. Google Trends data allows us to see what people are searching for at a very local level. This visualization tracks the top searches for common health issues in the United States, from Cancer to Diabetes, and compares them with the actual location of occurrences for those same health conditions to understand how search data reflects life for millions of Americans.
How does search interest for top health issues change over time? From 2004–2017, the data shows that search interest gradually increased over the past few years. Certain regions show a more significant increase in search interest than others. The increase in search activity is greatest in the Midwest and Northeast, while the changes are noticeably less dramatic in California, Texas, and Idaho. Are people generally becoming more aware of health conditions and health risks?
The search interest data was collected using the Google Trends API. The visualisation also brings in incidences of each condition so they can be compared. The health conditions were hand-selected from the Community Health Status Indicators (CHSI) which provides key indicators for local communities in the United States. The CHSI dataset includes more than 200 measures for each of the 3,141 United States counties. More information about the CHSI can be found on healthdata.gov.
Many striking similarities exist between searches and actual conditions—but the relationship between the Obesity and Diabetes maps stands out the most. “There are many risk factors for type 2 diabetes such as age, race, pregnancy, stress, certain medications, genetics or family history, high cholesterol and obesity. However, the single best predictor of type 2 diabetes is overweight or obesity. Almost 90% of people living with type 2 diabetes are overweight or have obesity. People who are overweight or have obesity have added pressure on their body's ability to use insulin to properly control blood sugar levels, and are therefore more likely to develop diabetes.” —Obesity Society via obesity.org
https://creativecommons.org/publicdomain/zero/1.0/
Getting Google Trends data for a large number of stocks can be tedious, so I've compiled Google Trends history for 4000+ stocks since 2004 in a quick, easy-to-use format for anyone who needs it.
Every column other than "date" represents a ticker and its search volume on a scale from 0 to 100, where 0 is the lowest search volume the ticker has ever had and 100 is the highest.
Pytrends was used for getting the trends data and yfinance was used for getting stock prices.
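Given the wide layout described above (one "date" column plus one column per ticker), pulling out a single ticker's history might look like the sketch below. The sample rows, the ticker names `AAA` and `BBB`, and the date format are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Made-up rows for illustration only; tickers, dates, and values are hypothetical.
SAMPLE_CSV = """date,AAA,BBB
2004-01-01,0,40
2004-02-01,55,100
2004-03-01,100,0
"""


def ticker_series(csv_file, ticker):
    """Return [(date, score), ...] for one ticker column of a wide CSV."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or ticker not in reader.fieldnames:
        raise KeyError(f"no column named {ticker!r}")
    return [(row["date"], float(row[ticker])) for row in reader]


series = ticker_series(io.StringIO(SAMPLE_CSV), "AAA")
# The row scored 100 marks the ticker's peak search interest.
peak_date, peak_score = max(series, key=lambda item: item[1])
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle.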
Can a stock's Google search volume be used to profitably make investment decisions?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Google Search Trends: Online Training: Udemy data was reported at 12.000 Score on 14 May 2025, an increase from 11.000 Score on 13 May 2025. The series is updated daily, with a median of 0.000 Score from Dec 2021 to 14 May 2025 across 1261 observations. The data reached an all-time high of 100.000 Score on 26 Nov 2023 and a record low of 0.000 Score on 02 May 2025. The series remains in active status in CEIC and is reported by Google Trends. The data is categorized under Global Database’s Tunisia – Table TN.Google.GT: Google Search Trends: by Categories.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Google data search exercises can be used to practice finding data or statistics on a topic of interest, including using Google's own internal tools and by using advanced operators.
OpenWeb Ninja's Google Images Data (Google SERP Data) API provides real-time image search capabilities for images sourced from all public sources on the web.
The API enables you to search and access more than 100 billion images from across the web including advanced filtering capabilities as supported by Google Advanced Image Search. The API provides Google Images Data (Google SERP Data) including details such as image URL, title, size information, thumbnail, source information, and more data points. The API supports advanced filtering and options such as file type, image color, usage rights, creation time, and more. In addition, any Advanced Google Search operators can be used with the API.
OpenWeb Ninja's Google Images Data & Google SERP Data API common use cases:
Creative Media Production: Enhance digital content with a vast array of real-time images, ensuring engaging and brand-aligned visuals for blogs, social media, and advertising.
AI Model Enhancement: Train and refine AI models with diverse, annotated images, improving object recognition and image classification accuracy.
Trend Analysis: Identify emerging market trends and consumer preferences through real-time visual data, enabling proactive business decisions.
Innovative Product Design: Inspire product innovation by exploring current design trends and competitor products, ensuring market-relevant offerings.
Advanced Search Optimization: Improve search engines and applications with enriched image datasets, providing users with accurate, relevant, and visually appealing search results.
OpenWeb Ninja's Annotated Imagery Data & Google SERP Data Stats & Capabilities:
100B+ Images: Access an extensive database of over 100 billion images.
Images Data from all Public Sources (Google SERP Data): Benefit from a comprehensive aggregation of image data from various public websites, ensuring a wide range of sources and perspectives.
Extensive Search and Filtering Capabilities: Utilize advanced search operators and filters to refine image searches by file type, color, usage rights, creation time, and more, making it easy to find exactly what you need.
Rich Data Points: Each image comes with more than 10 data points, including URL, title (annotation), size information, thumbnail, and source information, providing a detailed context for each image.
The number of times during the month someone searched the name of a State of Iowa Office with a Google My Business profile using Google Search or while on Google Maps.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise, in BigQuery. It’s a great way to analyze business data and learn the benefits of using BigQuery to analyze Analytics 360 data. The data is typical of what an ecommerce website would see and includes the following information:
- Traffic source data: information about where website visitors originate, including data about organic traffic, paid search traffic, and display traffic.
- Content data: information about the behavior of users on the site, such as URLs of pages that visitors look at, how they interact with content, etc.
- Transactional data: information about the transactions on the Google Merchandise Store website.
Limitations: All users have view access to the dataset. This means you can query the dataset and generate reports but you cannot complete administrative tasks. Data for some fields is obfuscated, such as fullVisitorId, or removed, such as clientId, adWordsClickInfo and geoNetwork. “Not available in demo dataset” will be returned for STRING values and “null” will be returned for INTEGER values when querying the fields containing no data.
This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
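Because removed fields come back as a placeholder string or as null, downstream code may want to treat both as missing. The sketch below assumes query results have already been loaded as plain dictionaries; the helper names and the sample row are hypothetical, and the comparison is case-insensitive because the exact casing of the placeholder in query results is not confirmed here.

```python
# Placeholder text quoted in the dataset description, lowercased for comparison.
PLACEHOLDER = "not available in demo dataset"


def clean_value(value):
    """Map the demo placeholder string (and None) to None; keep other values."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip().lower() == PLACEHOLDER:
        return None
    return value


def clean_row(row):
    """Apply clean_value to every field of one result row."""
    return {field: clean_value(value) for field, value in row.items()}


# Hypothetical row; "city" and "visits" are illustrative field names.
row = clean_row(
    {"fullVisitorId": "123", "city": "Not available in demo dataset", "visits": None}
)
```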
CSV version of Looker Ecommerce Dataset.
Overview: TheLook (dataset in BigQuery) is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
distribution_centers.csv
- id: Unique identifier for each distribution center.
- name: Name of the distribution center.
- latitude: Latitude coordinate of the distribution center.
- longitude: Longitude coordinate of the distribution center.
events.csv
- id: Unique identifier for each event.
- user_id: Identifier for the user associated with the event.
- sequence_number: Sequence number of the event.
- session_id: Identifier for the session during which the event occurred.
- created_at: Timestamp indicating when the event took place.
- ip_address: IP address from which the event originated.
- city: City where the event occurred.
- state: State where the event occurred.
- postal_code: Postal code of the event location.
- browser: Web browser used during the event.
- traffic_source: Source of the traffic leading to the event.
- uri: Uniform Resource Identifier associated with the event.
- event_type: Type of event recorded.
inventory_items.csv
- id: Unique identifier for each inventory item.
- product_id: Identifier for the associated product.
- created_at: Timestamp indicating when the inventory item was created.
- sold_at: Timestamp indicating when the item was sold.
- cost: Cost of the inventory item.
- product_category: Category of the associated product.
- product_name: Name of the associated product.
- product_brand: Brand of the associated product.
- product_retail_price: Retail price of the associated product.
- product_department: Department to which the product belongs.
- product_sku: Stock Keeping Unit (SKU) of the product.
- product_distribution_center_id: Identifier for the distribution center associated with the product.
order_items.csv
- id: Unique identifier for each order item.
- order_id: Identifier for the associated order.
- user_id: Identifier for the user who placed the order.
- product_id: Identifier for the associated product.
- inventory_item_id: Identifier for the associated inventory item.
- status: Status of the order item.
- created_at: Timestamp indicating when the order item was created.
- shipped_at: Timestamp indicating when the order item was shipped.
- delivered_at: Timestamp indicating when the order item was delivered.
- returned_at: Timestamp indicating when the order item was returned.
orders.csv
- order_id: Unique identifier for each order.
- user_id: Identifier for the user who placed the order.
- status: Status of the order.
- gender: Gender information of the user.
- created_at: Timestamp indicating when the order was created.
- returned_at: Timestamp indicating when the order was returned.
- shipped_at: Timestamp indicating when the order was shipped.
- delivered_at: Timestamp indicating when the order was delivered.
- num_of_item: Number of items in the order.
products.csv
- id: Unique identifier for each product.
- cost: Cost of the product.
- category: Category to which the product belongs.
- name: Name of the product.
- brand: Brand of the product.
- retail_price: Retail price of the product.
- department: Department to which the product belongs.
- sku: Stock Keeping Unit (SKU) of the product.
- distribution_center_id: Identifier for the distribution center associated with the product.
users.csv
- id: Unique identifier for each user.
- first_name: First name of the user.
- last_name: Last name of the user.
- email: Email address of the user.
- age: Age of the user.
- gender: Gender of the user.
- state: State where t...
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
This dataset was created during the Programming Language Ecosystem project from TU Wien using the code inside the repository https://github.com/ValentinFutterer/UsageOfProgramminglanguages2011-2023?tab=readme-ov-file.
The centerpiece of this repository is usage_of_programming_languages_2011-2023.csv. This CSV file shows the popularity of programming languages over the last 12 years in yearly increments. The repository also contains graphs created with the dataset. To get an accurate estimate of the popularity of programming languages, this dataset was created using three vastly different sources.
The dataset was created using the GitHub repository above. As input data, three public datasets were used.
- Taken from https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/ by Peter Elmers. It is licensed under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/. It contains metadata (no code) for all GitHub repositories with more than 5 stars.
- Taken from https://github.com/pypl/pypl.github.io/tree/master, put online by the user pcarbonn. It is licensed under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/. It shows, for each month from 2004 to 2023, the share of programming-related Google searches per language.
- Taken from https://insights.stackoverflow.com/survey. It is licensed under Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/. It contains the results of the yearly Stack Overflow developer survey from 2011 to 2023.
All these datasets were downloaded on 12 December 2023. The datasets are all in the GitHub repository above.
The dataset contains a column for the year and then one column per language, denoting its usage in percent. Additionally, vertical bar charts and pie charts for each year, plus a line graph for each language over the whole timespan, are provided as PNG files.
The languages that are going to be considered for the project can be seen here:
- Python
- C
- C++
- Java
- C#
- JavaScript
- PHP
- SQL
- Assembly
- Scratch
- Fortran
- Go
- Kotlin
- Delphi
- Swift
- Rust
- Ruby
- R
- COBOL
- F#
- Perl
- TypeScript
- Haskell
- Scala
This project is licensed under the Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/.
TLDR: You are free to share, adapt, and create derivative works from this dataset as long as you attribute me, keep the database open (if you redistribute it), and continue to share-alike any adapted database under the ODbL.
Thanks go out to
- Stack Overflow https://insights.stackoverflow.com/survey for providing the data from the yearly Stack Overflow developer survey.
- the PYPL survey, https://github.com/pypl/pypl.github.io/tree/master for providing Google search data.
- Peter Elmers, for crawling metadata on GitHub repositories and providing the data https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Moldova Google Search Trends: Online Training: Udemy data was reported at 4.000 Score on 15 May 2025, a decrease from 7.000 Score on 14 May 2025. The series is updated daily, with a median of 0.000 Score from Dec 2021 to 15 May 2025 across 1262 observations. The data reached an all-time high of 100.000 Score on 15 Oct 2022 and a record low of 0.000 Score on 06 May 2025. The series remains in active status in CEIC and is reported by Google Trends. The data is categorized under Global Database’s Moldova – Table MD.Google.GT: Google Search Trends: by Categories.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present Qbias, two novel datasets that promote the investigation of bias in online news search as described in
Fabian Haak and Philipp Schaer. 2023. Qbias - A Dataset on Media Bias in Search Queries and Query Suggestions. In Proceedings of ACM Web Science Conference (WebSci’23). ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3578503.3583628.
Dataset 1: AllSides Balanced News Dataset (allsides_balanced_news_headlines-texts.csv)
The dataset contains 21,747 news articles collected from AllSides balanced news headline roundups in November 2022 as presented in our publication. The AllSides balanced news features three expert-selected U.S. news articles from sources of different political views (left, right, center), often featuring spin bias, slant, and other forms of non-neutral reporting on political news. All articles are tagged with a bias label by four expert annotators based on the expressed political partisanship: left, right, or neutral. The AllSides balanced news aims to offer multiple political perspectives on important news stories, educate users on biases, and provide multiple viewpoints. Collected data further includes headlines, dates, news texts, topic tags (e.g., "Republican party", "coronavirus", "federal jobs"), and the publishing news outlet. We also include AllSides' neutral description of the topic of the articles. Overall, the dataset contains 10,273 articles tagged as left, 7,222 as right, and 4,252 as center.
To provide easier access to the most recent and complete version of the dataset for future research, we provide a scraping tool and a regularly updated version of the dataset at https://github.com/irgroup/Qbias. The more recent versions in the repository carry additional tags (such as the URL to the article). We chose to publish the version used for fine-tuning the models on Zenodo to enable the reproduction of the results of our study.
Dataset 2: Search Query Suggestions (suggestions.csv)
The second dataset we provide consists of 671,669 search query suggestions for root queries based on tags of the AllSides biased news dataset. We collected search query suggestions from Google and Bing for the 1,431 topic tags that have been used for tagging AllSides news at least five times, approximately half of the total number of topics. The topic tags include names, a wide range of political terms, agendas, and topics (e.g., "communism", "libertarian party", "same-sex marriage"), cultural and religious terms (e.g., "Ramadan", "pope Francis"), locations, and other news-relevant terms. On average, the dataset contains 469 search queries for each topic. In total, 318,185 suggestions have been retrieved from Google and 353,484 from Bing.
The file contains a "root_term" column based on the AllSides topic tags. The "query_input" column contains the search term submitted to the search engine ("search_engine"). The "query_suggestion" and "rank" columns hold the search query suggestions and their positions as returned by the search engines at the time of search ("datetime"). We scraped our data from a US server, whose location is saved in "location".
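As a small sketch of working with these columns, the snippet below counts suggestion rows per search engine. Only the column names come from the description; the sample rows, the column order, the comma delimiter, the timestamp format, and the engine labels "google" and "bing" are assumptions.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration; only the column names come from the description.
SAMPLE_CSV = """root_term,query_input,query_suggestion,rank,search_engine,datetime,location
democrats,democrats a,democrats and recession,1,google,2022-11-01 12:00:00,Iowa
democrats,democrats a,democrats abroad,2,google,2022-11-01 12:00:00,Iowa
democrats,democrats a,democrats and republicans,1,bing,2022-11-01 12:05:00,Iowa
"""


def suggestions_per_engine(csv_file):
    """Count suggestion rows per search engine in a suggestions.csv-style file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row["search_engine"]] += 1
    return counts


counts = suggestions_per_engine(io.StringIO(SAMPLE_CSV))
```

Run against the full file, the same count should be comparable with the per-engine totals reported for the dataset.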
We retrieved ten search query suggestions provided by the Google and Bing search autocomplete systems for the input of each of these root queries, without performing a search. Furthermore, we extended the root queries by the letters a to z (e.g., "democrats" (root term) >> "democrats a" (query input) >> "democrats and recession" (query suggestion)) to simulate a user's input during information search and generate a total of up to 270 query suggestions per topic and search engine. The dataset we provide contains columns for root term, query input, and query suggestion for each suggested query. The location from which the search is performed is the location of the Google servers running Colab, in our case Iowa in the United States of America, which is added to the dataset.
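The query-expansion step can be sketched as follows. Reading the upper bound of 270 as 27 query inputs (the root term plus 26 letter extensions) times ten suggestions each is an inference from the numbers in the text, and the function name is hypothetical.

```python
import string

SUGGESTIONS_PER_INPUT = 10  # ten suggestions retrieved per query input


def query_inputs(root_term):
    """Return the root term followed by the root term extended with a to z."""
    extended = [f"{root_term} {letter}" for letter in string.ascii_lowercase]
    return [root_term] + extended


inputs = query_inputs("democrats")
# Upper bound on suggestions per topic and search engine.
max_suggestions = len(inputs) * SUGGESTIONS_PER_INPUT
```

The bound is "up to" 270 because an autocomplete system may return fewer than ten suggestions for some inputs.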
AllSides Scraper
At https://github.com/irgroup/Qbias, we provide a scraping tool that allows for the automatic retrieval of all available articles from the AllSides balanced news headlines.
We want to provide an easy means of retrieving the news and all corresponding information. For many tasks it is relevant to have the most recent documents available. Thus, we provide this Python-based scraper, which scrapes all available AllSides news articles and gathers the available information. By providing the scraper we facilitate access to a recent version of the dataset for other researchers.