100+ datasets found

Data from: San Francisco Open Data
kaggle.com
zip
Updated Mar 20, 2019
Cite
DataSF (2019). San Francisco Open Data [Dataset]. https://www.kaggle.com/datasf/san-francisco
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
DataSF
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
San Francisco
Description
Context

DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.

https://datasf.org/about/

Content

This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']

This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services.

This data includes fire unit responses to calls from April 2000 to present and is updated daily. Data contains the call number, incident number, address, unit identifier, call type, and disposition. Relevant time intervals are also included. Because this dataset is based on responses, and most calls involved multiple fire units, there are multiple records for each call number. Addresses are associated with a block number, intersection or call box.
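Because each call number can appear once per responding unit, counting rows overstates the number of calls. The sketch below shows one way to count distinct calls per call type. The records and field layout are hypothetical sample data for illustration, not the real table schema.

```python
from collections import defaultdict

# Hypothetical response records: (call_number, unit_id, call_type).
# Values and field order are illustrative only, not the real schema.
responses = [
    ("001", "E01", "Structure Fire"),
    ("001", "T03", "Structure Fire"),
    ("002", "M12", "Medical Incident"),
    ("003", "E07", "Medical Incident"),
    ("003", "M14", "Medical Incident"),
]


def count_calls_by_type(records):
    """Count distinct call numbers per call type, ignoring extra unit rows."""
    calls_by_type = defaultdict(set)
    for call_number, _unit_id, call_type in records:
        calls_by_type[call_type].add(call_number)
    return {call_type: len(numbers) for call_type, numbers in calls_by_type.items()}


calls = count_calls_by_type(responses)
print(f"{len(responses)} response rows, {sum(calls.values())} distinct calls")
print(calls)
```

Grouping on the call number before aggregating keeps per-call questions (for example, the most common call types) from being skewed toward incidents that drew many units.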

This data includes incidents from the San Francisco Police Department (SFPD) Crime Incident Reporting system, from January 2003 until the present (2 weeks ago from current date). The dataset is updated daily. Please note: the SFPD has implemented a new system for tracking crime. This dataset is still sourced from the old system, which is in the process of being retired (a multi-year process).

This data includes a list of San Francisco Department of Public Works maintained street trees including: planting date, species, and location. Data includes 1955 to present.

This dataset is deprecated and not being updated.

Fork this kernel to get started with this dataset.

Acknowledgements

http://datasf.org/

https://cloud.google.com/bigquery/public-data/sfo-311

https://cloud.google.com/bigquery/public-data/sffd-service-calls

https://cloud.google.com/bigquery/public-data/sfpd-reports

https://cloud.google.com/bigquery/public-data/sfo-trees

Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @meric from Unsplash.

Inspiration

Which neighborhoods have the highest proportion of offensive graffiti?

Which complaint is most likely to be made using Twitter and in which neighborhood?

What are the most complained about Muni stops in San Francisco?

What are the top 10 incident types that the San Francisco Fire Department responds to?

How many medical incidents and structure fires are there in each neighborhood?

What’s the average response time for each type of dispatched vehicle?

Which category of police incidents have historically been the most common in San Francisco?

What were the most common police incidents in the category of LARCENY/THEFT in 2016?

Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?

What is the average tree diameter?

What is the highest number of a particular species of tree planted in a single year?

Which San Francisco locations feature the largest number of trees?
Trust level French people have in Google and Facebook to ensure data...
statista.com
Updated Mar 21, 2022
Cite
Statista (2022). Trust level French people have in Google and Facebook to ensure data protection 2019 [Dataset]. https://www.statista.com/statistics/1010095/trust-google-facebook-developing-better-data-protection-france/
Dataset updated
Mar 21, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
May 15, 2019 - May 16, 2019
Area covered
France
Description
This pie chart displays the level of trust people in France have in Google and Facebook to develop better tools for personal data protection on the Internet, based on a 2019 survey. It shows that 37 percent of respondents tended not to trust those companies to ensure data protection, while 35 percent said they tended to trust them.
Global market share of leading desktop search engines 2015-2025
statista.com
ai-chatbox.pro
Updated Apr 28, 2025
Cite
Statista (2025). Global market share of leading desktop search engines 2015-2025 [Dataset]. https://www.statista.com/statistics/216573/worldwide-market-share-of-search-engines/
Dataset updated
Apr 28, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2015 - Mar 2025
Area covered
Worldwide
Description
As of March 2025, Google represented 79.1 percent of the global online search engine market on desktop devices. Although Google remains far ahead of its competitors, this is the lowest share it has recorded on these devices in over two decades. Meanwhile, its long-time competitor Bing accounted for 12.21 percent, and tools like Yahoo and Yandex held shares of over 2.9 percent each.

Google and the global search market

Ever since the introduction of Google Search in 1997, the company has dominated the search engine market, while the shares of all other tools have been rather lopsided. The majority of Google revenues are generated through advertising. Its parent corporation, Alphabet, was one of the biggest internet companies worldwide as of 2024, with a market capitalization of 2.02 trillion U.S. dollars. The company has also expanded its services to mail, productivity tools, enterprise products, mobile devices, and other ventures. As a result, Google earned one of the highest tech company revenues in 2024, with roughly 348.16 billion U.S. dollars.

Search engine usage in different countries

Google is the most frequently used search engine worldwide. But in some countries, its alternatives are leading or competing with it to some extent. As of the last quarter of 2023, more than 63 percent of internet users in Russia used Yandex, whereas Google users represented little over 33 percent. Meanwhile, Baidu was the most used search engine in China, despite a strong decrease in the percentage of internet users in the country accessing it. In other countries, like Japan and Mexico, people tend to use Yahoo along with Google. By the end of 2024, nearly half of the respondents in Japan said that they had used Yahoo in the past four weeks. In the same year, over 21 percent of users in Mexico said they used Yahoo.
Google Ads Transparency Center
console.cloud.google.com
Updated Sep 6, 2023
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:BigQuery%20Public%20Data&hl=de&inv=1&invt=Ab3rsg (2023). Google Ads Transparency Center [Dataset]. https://console.cloud.google.com/marketplace/product/bigquery-public-data/google-ads-transparency-center?hl=de
Dataset updated
Sep 6, 2023
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Google (http://google.com/)
Description
This dataset contains two tables: creative_stats and removed_creative_stats.

The creative_stats table contains information about advertisers that served ads in the European Economic Area or Turkey: their legal name, verification status, disclosed name, and location. It also includes ad specific information: impression ranges per region (including aggregate impressions for the European Economic Area), first shown and last shown dates, which criteria were used in audience selection, the format of the ad, the ad topic and whether the ad is funded by Google Ad Grants program. A link to the ad in the Google Ads Transparency Center is also provided.

The removed_creative_stats table contains information about ads that served in the European Economic Area that Google removed: where and why they were removed and per-region information on when they served. The removed_creative_stats table also contains a link to the Google Ads Transparency Center for the removed ad.

Data for both tables updates periodically and may be delayed from what appears on the Google Ads Transparency Center website.

About BigQuery

This data is hosted in Google BigQuery for users to easily query using SQL. Note that to use BigQuery, users must have a Google account and create a GCP project. This public dataset is included in BigQuery's 1TB/mo of free tier processing. Each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery.

Download Dataset

This public dataset is also hosted in Google Cloud Storage here and available free to use. Use this quick start guide to quickly learn how to access public datasets on Google Cloud Storage. We provide the raw data in JSON format, sharded across multiple files to support easier download of the large dataset. A README file which describes the data structure and our Terms of Service (also listed below) is included with the dataset. You can also download the results from a custom query. See here for options and instructions.

Signed out users can download the full dataset by using the gCloud CLI. Follow the instructions here to download and install the gCloud CLI.

To remove the login requirement, run:
    $ gcloud config set auth/disable_credentials True

To download the dataset, run:
    $ gcloud storage cp gs://ads-transparency-center/* . -R
Google SERP Data, Web Search Data, Google Images Data | Real-Time API
datarade.ai
.json, .csv
Cite
OpenWeb Ninja, Google SERP Data, Web Search Data, Google Images Data | Real-Time API [Dataset]. https://datarade.ai/data-products/openweb-ninja-google-data-google-image-data-google-serp-d-openweb-ninja
Explore at:
Available download formats: .json, .csv
Dataset authored and provided by
OpenWeb Ninja
Area covered
Burundi, Uganda, Panama, South Georgia and the South Sandwich Islands, Ireland, Tokelau, Grenada, Barbados, Virgin Islands (U.S.), Uruguay
Description
OpenWeb Ninja's Google Images Data (Google SERP Data) API provides real-time image search capabilities for images sourced from all public sources on the web.

The API enables you to search and access more than 100 billion images from across the web including advanced filtering capabilities as supported by Google Advanced Image Search. The API provides Google Images Data (Google SERP Data) including details such as image URL, title, size information, thumbnail, source information, and more data points. The API supports advanced filtering and options such as file type, image color, usage rights, creation time, and more. In addition, any Advanced Google Search operators can be used with the API.

OpenWeb Ninja's Google Images Data & Google SERP Data API common use cases:

Creative Media Production: Enhance digital content with a vast array of real-time images, ensuring engaging and brand-aligned visuals for blogs, social media, and advertising.

AI Model Enhancement: Train and refine AI models with diverse, annotated images, improving object recognition and image classification accuracy.

Trend Analysis: Identify emerging market trends and consumer preferences through real-time visual data, enabling proactive business decisions.

Innovative Product Design: Inspire product innovation by exploring current design trends and competitor products, ensuring market-relevant offerings.

Advanced Search Optimization: Improve search engines and applications with enriched image datasets, providing users with accurate, relevant, and visually appealing search results.

OpenWeb Ninja's Annotated Imagery Data & Google SERP Data Stats & Capabilities:

100B+ Images: Access an extensive database of over 100 billion images.

Images Data from all Public Sources (Google SERP Data): Benefit from a comprehensive aggregation of image data from various public websites, ensuring a wide range of sources and perspectives.

Extensive Search and Filtering Capabilities: Utilize advanced search operators and filters to refine image searches by file type, color, usage rights, creation time, and more, making it easy to find exactly what you need.

Rich Data Points: Each image comes with more than 10 data points, including URL, title (annotation), size information, thumbnail, and source information, providing a detailed context for each image.
Google Drive Statistics And Facts (2025)
electroiq.com
Updated Apr 2, 2025
Cite
Electro IQ (2025). Google Drive Statistics And Facts (2025) [Dataset]. https://electroiq.com/stats/google-drive-statistics/
Dataset updated
Apr 2, 2025
Dataset authored and provided by
Electro IQ
License
https://electroiq.com/privacy-policy
Time period covered
2022 - 2032
Area covered
Global
Description
Introduction

Google Drive Statistics: Google Drive, which was launched by Google in 2012, currently serves as the biggest pillar of cloud storage and collaboration. It has enabled users to access and keep files on a single platform that is made portable and synchronised across devices. By the end of 2024, its adoption rates will reflect users' inclination toward the integral roles it has assumed in their personal and professional lives.

The article discusses the Google Drive statistics, including user engagement and market presence, along with data security and performance in terms of finances.
USA Name Data
kaggle.com
zip
Updated Feb 12, 2019
Cite
Data.gov (2019). USA Name Data [Dataset]. https://www.kaggle.com/datasets/datagov/usa-names
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset provided by
Data.gov (https://data.gov/)
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
Context

Cultural diversity in the U.S. has led to great variations in names and naming traditions and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States

Content

This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.

All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.
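The privacy rule above can be pictured as a simple count threshold. The sketch below is a minimal illustration that assumes the threshold is applied to raw per-name counts. The description does not state the granularity at which the Social Security Administration applies it, and the sample names and the function name are hypothetical.

```python
from collections import Counter

MIN_OCCURRENCES = 5  # threshold stated in the dataset description


def publishable_names(applications, min_occurrences=MIN_OCCURRENCES):
    """Keep only names that occur at least `min_occurrences` times."""
    counts = Counter(applications)
    return {name: n for name, n in counts.items() if n >= min_occurrences}


# Hypothetical sample: one name falls below the threshold and is withheld.
sample = ["Emma"] * 7 + ["Liam"] * 5 + ["Zyx"] * 4
released = publishable_names(sample)
print(released)
```

One consequence for analysis is that rare names are absent from the data rather than recorded with small counts, so totals computed from the released names undercount the underlying applications.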

Fork this kernel to get started with this dataset.

Acknowledgements

https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names

https://cloud.google.com/bigquery/public-data/usa-names

Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @dcp from Unsplash.

Inspiration

What are the most common names?

What are the most common female names?

Are there more female or male names?

Female names by a wide margin?
Online Learning Global Queries Dataset: A Comprehensive Dataset of What...
ieee-dataport.org
Updated May 11, 2022
Cite
Isabella Hall (2022). Online Learning Global Queries Dataset: A Comprehensive Dataset of What People from Different Countries ask Google about Online Learning [Dataset]. https://ieee-dataport.org/documents/online-learning-global-queries-dataset-comprehensive-dataset-what-people-different
Dataset updated
May 11, 2022
Authors
Isabella Hall
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Any work using this dataset should cite the following paper:
Frequency of using Google Docs in the U.S. 2018
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Frequency of using Google Docs in the U.S. 2018 [Dataset]. https://www.statista.com/forecasts/1011649/frequency-of-using-google-docs-in-the-us
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Oct 26, 2018 - Nov 5, 2018
Area covered
United States
Description
The displayed data on the frequency of using Google Docs shows results of an exclusive Statista survey conducted in the United States in 2018. Some ** percent of respondents answered the question "How often do you use Google Docs?" with "Daily". The Survey Data Table for the Statista survey Tech Giants and Digital Services in the United States 2019 contains the complete tables for the survey including various column headings.
Data from: Google Analytics & Twitter dataset from a movies, TV series and...
portalcientificovalencia.univeuropea.com
figshare.com
Updated 2024
Cite
Yeste, Víctor; Yeste, Víctor (2024). Google Analytics & Twitter dataset from a movies, TV series and videogames website [Dataset]. https://portalcientificovalencia.univeuropea.com/documentos/67321ed3aea56d4af0485dc8
Dataset updated
2024
Authors
Yeste, Víctor; Yeste, Víctor
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.

The object of this study is the design of a cybermetric methodology whose objectives are to measure the success of the content published in online media and the possible prediction of the selected success variables.

In this case, because data had to be integrated from two separate areas, web publishing and the analysis of shares and related topics on Twitter, the data were collected programmatically through both the Google Analytics v4 reporting API and the Twitter Standard API, always respecting their limits.

The website analyzed is hellofriki.com. It is an online media outlet whose primary aim is to meet the need for information on topics that generate a large volume of news every day, along with analysis, reports, interviews, and many other information formats. All these contents fall under the sections of cinema, series, video games, literature, and comics.

This dataset has contributed to the elaboration of the PhD Thesis:

Yeste Moreno, VM. (2021). Diseño de una metodología cibermétrica de cálculo del éxito para la optimización de contenidos web [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/176009

Data have been obtained from each last-minute news article published online according to the indicators described in the doctoral thesis. All related data are stored in a database, divided into the following tables:

tesis_followers: User ID list of media account followers.

tesis_hometimeline: data from tweets posted by the media account sharing breaking news from the web.
  status_id: Tweet ID
  created_at: date of publication
  text: content of the tweet
  path: URL extracted after processing the shortened URL in text
  post_shared: Article ID in WordPress that is being shared
  retweet_count: number of retweets
  favorite_count: number of favorites

tesis_hometimeline_other: data from tweets posted by the media account that do not share breaking news from the web. Other typologies, automatic Facebook shares, custom tweets without link to an article, etc. With the same fields as tesis_hometimeline.

tesis_posts: data of articles published by the web and processed for some analysis.
  stats_id: Analysis ID
  post_id: Article ID in WordPress
  post_date: article publication date in WordPress
  post_title: title of the article
  path: URL of the article on the media website
  tags: Tags ID or WordPress tags related to the article
  uniquepageviews: unique page views
  entrancerate: entrance rate
  avgtimeonpage: average visit time
  exitrate: exit rate
  pageviewspersession: page views per session
  adsense_adunitsviewed: number of ads viewed by users
  adsense_viewableimpressionpercent: ad display ratio
  adsense_ctr: ad click ratio
  adsense_ecpm: estimated ad revenue per 1000 page views

tesis_stats: data from a particular analysis, performed at each published breaking news item. Fields with statistical values can be computed from the data in the other tables, but total and average calculations are saved for faster and easier further processing.
  id: ID of the analysis
  phase: phase of the thesis in which analysis has been carried out (right now all are 1)
  time: "0" if at the time of publication, "1" if 14 days later
  start_date: date and time of measurement on the day of publication
  end_date: date and time when the measurement is made 14 days later
  main_post_id: ID of the published article to be analysed
  main_post_theme: Main section of the published article to analyze
  superheroes_theme: "1" if about superheroes, "0" if not
  trailer_theme: "1" if trailer, "0" if not
  name: empty field, possibility to add a custom name manually
  notes: empty field, possibility to add personalized notes manually, for example if some tag has been removed manually for being considered too generic, even though the editor added it
  num_articles: number of articles analysed
  num_articles_with_traffic: number of articles analysed with traffic (which will be taken into account for traffic analysis)
  num_articles_with_tw_data: number of articles with data from when they were shared on the media's Twitter account
  num_terms: number of terms analyzed
  uniquepageviews_total: total page views
  uniquepageviews_mean: average page views
  entrancerate_mean: average entrance rate
  avgtimeonpage_mean: average duration of visits
  exitrate_mean: average exit rate
  pageviewspersession_mean: average page views per session
  total: total of ads viewed
  adsense_adunitsviewed_mean: average of ads viewed
  adsense_viewableimpressionpercent_mean: average ad display ratio
  adsense_ctr_mean: average ad click ratio
  adsense_ecpm_mean: estimated ad revenue per 1000 page views
  Total: total income
  retweet_count_mean: average income
  favorite_count_total: total of favorites
  favorite_count_mean: average of favorites
  terms_ini_num_tweets: total tweets on the terms on the day of publication
  terms_ini_retweet_count_total: total retweets on the terms on the day of publication
  terms_ini_retweet_count_mean: average retweets on the terms on the day of publication
  terms_ini_favorite_count_total: total of favorites on the terms on the day of publication
  terms_ini_favorite_count_mean: average of favorites on the terms on the day of publication
  terms_ini_followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the terms on the day of publication
  terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms on the day of publication
  terms_ini_user_num_tweets_mean: average number of tweets published by users who spoke about the terms on the day of publication
  terms_ini_user_age_mean: average age in days of users who have spoken of the terms on the day of publication
  terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms on the day of publication
  terms_end_num_tweets: total tweets on terms 14 days after publication
  terms_ini_retweet_count_total: total retweets on terms 14 days after publication
  terms_ini_retweet_count_mean: average retweets on terms 14 days after publication
  terms_ini_favorite_count_total: total of favorites on terms 14 days after publication
  terms_ini_favorite_count_mean: average of favorites on terms 14 days after publication
  terms_ini_followers_talking_rate: ratio of media Twitter account followers who have recently posted a tweet talking about the terms 14 days after publication
  terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms 14 days after publication
  terms_ini_user_num_tweets_mean: average number of tweets published by users who have spoken about the terms 14 days after publication
  terms_ini_user_age_mean: the average age in days of users who have spoken of the terms 14 days after publication
  terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms 14 days after publication

tesis_terms: data of the terms (tags) related to the processed articles.
  stats_id: Analysis ID
  time: "0" if at the time of publication, "1" if 14 days later
  term_id: Term ID (tag) in WordPress
  name: Name of the term
  slug: URL of the term
  num_tweets: number of tweets
  retweet_count_total: total retweets
  retweet_count_mean: average retweets
  favorite_count_total: total of favorites
  favorite_count_mean: average of favorites
  followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the term
  user_num_followers_mean: average followers of users who were talking about the term
  user_num_tweets_mean: average number of tweets published by users who were talking about the term
  user_age_mean: average age in days of users who were talking about the term
  url_inclusion_rate: URL inclusion ratio
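The description says the statistical fields in tesis_stats can be recomputed from the other tables. The sketch below shows what that might look like for the page-view totals and means, using a few hypothetical tesis_posts rows. It assumes the mean is taken over all articles in an analysis. The description does not say whether articles without traffic are excluded, so treat that choice as an assumption.

```python
# Hypothetical tesis_posts rows; only the fields needed here are shown.
posts = [
    {"stats_id": 1, "post_id": 10, "uniquepageviews": 120},
    {"stats_id": 1, "post_id": 11, "uniquepageviews": 80},
    {"stats_id": 1, "post_id": 12, "uniquepageviews": 40},
    {"stats_id": 2, "post_id": 13, "uniquepageviews": 50},
]


def summarize_pageviews(rows, stats_id):
    """Recompute page-view aggregates for one analysis from its article rows."""
    views = [row["uniquepageviews"] for row in rows if row["stats_id"] == stats_id]
    total = sum(views)
    return {
        "num_articles": len(views),
        "uniquepageviews_total": total,
        # Assumption: average over every article in the analysis.
        "uniquepageviews_mean": total / len(views) if views else 0.0,
    }


summary = summarize_pageviews(posts, stats_id=1)
print(summary)
```

Comparing such recomputed values against the stored tesis_stats fields is a quick consistency check before relying on the precomputed aggregates.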
Google Trends - International
console.cloud.google.com
Updated May 16, 2023
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:BigQuery%20Public%20Datasets%20Program&hl=ja&inv=1&invt=Abz0QA (2023). Google Trends - International [Dataset]. https://console.cloud.google.com/marketplace/product/bigquery-public-datasets/google-trends-intl?hl=ja
Dataset updated
May 16, 2023
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Google Search (http://google.com/)
Google (http://google.com/)
Description
The International Google Trends dataset will provide critical signals that individual users and businesses alike can leverage to make better data-driven decisions. This dataset simplifies the manual interaction with the existing Google Trends UI by automating and exposing anonymized, aggregated, and indexed search data in BigQuery. This dataset includes the Top 25 stories and Top 25 Rising queries from Google Trends. It will be made available as two separate BigQuery tables, with a set of new top terms appended daily. Each set of Top 25 and Top 25 rising expires after 30 days, and will be accompanied by a rolling five-year window of historical data for each country and region across the globe, where data is available. This Google dataset is hosted in Google BigQuery as part of Google Cloud's Datasets solution and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery
Worldwide digital population 2025
statista.com
ai-chatbox.pro
Updated Apr 1, 2025
Cite
Statista (2025). Worldwide digital population 2025 [Dataset]. https://www.statista.com/statistics/617136/digital-population-worldwide/
Dataset updated
Apr 1, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
World
Description
As of February 2025, 5.56 billion individuals worldwide were internet users, which amounted to 67.9 percent of the global population. Of this total, 5.24 billion, or 63.9 percent of the world's population, were social media users.

Global internet usage

Connecting billions of people worldwide, the internet is a core pillar of the modern information society. Northern Europe ranked first among worldwide regions by the share of the population using the internet in 2025. In the Netherlands, Norway and Saudi Arabia, 99 percent of the population used the internet as of February 2025. North Korea was at the opposite end of the spectrum, with virtually no internet usage penetration among the general population, ranking last worldwide. Eastern Asia was home to the largest number of online users worldwide, with over 1.34 billion at the latest count. Southern Asia ranked second, with around 1.2 billion internet users. China, India, and the United States rank ahead of other countries worldwide by the number of internet users.

Worldwide internet user demographics

As of 2024, the share of female internet users worldwide was 65 percent, five percent less than that of men. Gender disparity in internet usage was bigger in African countries, with around a ten percent difference. Worldwide regions, like the Commonwealth of Independent States and Europe, showed a smaller usage gap between these two genders. As of 2024, global internet usage was higher among individuals between 15 and 24 years old across all regions, with young people in Europe representing the most significant usage penetration, 98 percent. In comparison, the worldwide average for the age group 15–24 years was 79 percent. The income level of the countries was also an essential factor for internet access, as 93 percent of the population of the countries with high income reportedly used the internet, as opposed to only 27 percent of the low-income markets.
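The headline figures can be cross-checked with simple arithmetic: dividing each user count by its stated population share should imply roughly the same world population. The sketch below does that with the numbers quoted in the description. The variable and function names are illustrative.

```python
# Figures quoted in the description (February 2025): billions and percent.
internet_users = 5.56
internet_share = 67.9
social_users = 5.24
social_share = 63.9


def implied_population(users_billion, share_percent):
    """World population (billions) implied by a user count and its share."""
    return users_billion / (share_percent / 100)


pop_from_internet = implied_population(internet_users, internet_share)
pop_from_social = implied_population(social_users, social_share)
social_among_online = social_users / internet_users * 100

print(f"Implied population (internet figures): {pop_from_internet:.2f} billion")
print(f"Implied population (social figures):   {pop_from_social:.2f} billion")
print(f"Social media users per 100 internet users: {social_among_online:.1f}")
```

Both pairs of figures imply a world population of about 8.2 billion, so the two percentages are consistent with each other to within rounding.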
cyclistic data case study 1
kaggle.com
Updated Dec 13, 2022
+ more versions
Cite
Nam Đại (2022). cyclistic data case study 1 [Dataset]. https://www.kaggle.com/datasets/namitra/cyclistic-data-case-study-1
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 13, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nam Đại
Description
Dataset

This dataset was created by Nam Đại

Contents
DataForSEO Google Full (Keywords+SERP) database, historical data available
datarade.ai
.json, .csv
Updated Aug 17, 2023
Cite
DataForSEO (2023). DataForSEO Google Full (Keywords+SERP) database, historical data available [Dataset]. https://datarade.ai/data-products/dataforseo-google-full-keywords-serp-database-historical-d-dataforseo
Explore at:
Available download formats: .json, .csv
Dataset updated
Aug 17, 2023
Dataset provided by
Authors
DataForSEO
Area covered
Paraguay, United Kingdom, Côte d'Ivoire, Cyprus, Burkina Faso, South Africa, Bolivia (Plurinational State of), Sweden, Portugal, Costa Rica
Description
You can check the fields description in the documentation: current Full database: https://docs.dataforseo.com/v3/databases/google/full/?bash; Historical Full database: https://docs.dataforseo.com/v3/databases/google/history/full/?bash.

Full Google Database is a combination of the Advanced Google SERP Database and Google Keyword Database.

Google SERP Database offers millions of SERPs collected in 67 regions with most of Google’s advanced SERP features, including featured snippets, knowledge graphs, people also ask sections, top stories, and more.

Google Keyword Database encompasses billions of search terms enriched with related Google Ads data: search volume trends, CPC, competition, and more.

This database is available in JSON format only.

You don’t have to download fresh data dumps in JSON – we can deliver data straight to your storage or database. We send terabytes of data to dozens of customers every month using Amazon S3, Google Cloud Storage, Microsoft Azure Blob, Elasticsearch, and Google BigQuery. Let us know if you’d like to get your data to any other storage or database.
USA Names
console.cloud.google.com
Updated Aug 15, 2023
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:U.S.%20Social%20Security%20Administration&hl=pt-BR&inv=1&invt=Ab4Asw (2023). USA Names [Dataset]. https://console.cloud.google.com/marketplace/product/social-security-administration/us-names?hl=pt-BR
Explore at:
Dataset updated
Aug 15, 2023
Dataset provided by
Google (http://google.com/)
Area covered
United States
Description
This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data. All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
App Downloads Data (2025)
businessofapps.com
Updated Aug 1, 2025
Cite
Business of Apps (2025). App Downloads Data (2025) [Dataset]. https://www.businessofapps.com/data/app-statistics/
Explore at:
Dataset updated
Aug 1, 2025
Dataset authored and provided by
Business of Apps
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
App Download Key Statistics; App and Game Downloads; iOS App and Game Downloads; Google Play App and Game Downloads; Game Downloads; iOS Game Downloads; Google Play Game Downloads; App Downloads; iOS App...
United States: number of internet users 2015-2025
statista.com
Updated Apr 29, 2025
Cite
Statista (2025). United States: number of internet users 2015-2025 [Dataset]. https://www.statista.com/statistics/276445/number-of-internet-users-in-the-united-states/
Explore at:
Dataset updated
Apr 29, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
As of February 2025, around 322 million people in the United States accessed the internet, making it one of the largest online markets worldwide. The country currently ranks third after China and India by the online audience size.

Overview of internet usage in the United States

The digital population in the United States has constantly increased in recent years. Among the most common reasons is the growing accessibility of broadband internet. A big part of the country's digital audience accesses the web via mobile phones. In 2024, the country saw an estimated 97.1 percent mobile internet user penetration. According to a 2024 survey, over 51 percent of U.S. women and 43 percent of men said it is important to them to have mobile internet access anywhere, at any time. Another 41 percent of respondents could not imagine their everyday life without the internet. Google and YouTube are the most visited websites in the country, while music, food, and drinks were the most discussed online topics.

Internet usage demographics in the United States

While some users can no longer imagine their life without the internet, others do not use it at all. According to 2021 data, 25 percent of U.S. adults 65 and older reported not using the internet. Despite this, online usage was strong across other age groups, especially young adults aged 18 to 49. This age group also reported the highest percentage of smartphone usage in the country as of 2023. Due to a persistent lack of connectivity in rural areas, more online users were based in urban areas of the U.S. than in the countryside.
Datasheet1_Mobility data shows effectiveness of control strategies for...
frontiersin.figshare.com
pdf
Updated Mar 7, 2024
Cite
Yuval Berman; Shannon D. Algar; David M. Walker; Michael Small (2024). Datasheet1_Mobility data shows effectiveness of control strategies for COVID-19 in remote, sparse and diffuse populations.pdf [Dataset]. http://doi.org/10.3389/fepid.2023.1201810.s001
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fepid.2023.1201810.s001
Dataset updated
Mar 7, 2024
Dataset provided by
Frontiers
Authors
Yuval Berman; Shannon D. Algar; David M. Walker; Michael Small
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data that is collected at the individual level from mobile phones is typically aggregated to the population level for privacy reasons. If we are interested in answering questions regarding the mean, or working with groups appropriately modeled by a continuum, then this data is immediately informative. However, coupling such population-level data to a model that requires information at the individual level raises a number of complexities. This is the case if we aim to characterize human mobility and simulate the spatial and geographical spread of a disease by dealing in discrete, absolute numbers. In this work, we highlight the hurdles faced and outline how they can be overcome to effectively leverage one specific dataset: the Google COVID-19 Aggregated Mobility Research Dataset (GAMRD). Using a case study of Western Australia, which has many sparsely populated regions with incomplete data, we first demonstrate how to overcome these challenges to approximate the absolute flow of people around a transport network from the aggregated data. Overlaying this evolving mobility network with a compartmental model for disease that incorporates vaccination status, we run simulations and draw meaningful conclusions about the spread of COVID-19 throughout the state without de-anonymizing the data. We can see that towns in the Pilbara region are highly vulnerable to an outbreak originating in Perth. Further, we show that regional restrictions on travel are not enough to stop the spread of the virus from reaching regional Western Australia. The methods explained in this paper can therefore be used to analyze disease outbreaks in similarly sparse populations. We demonstrate that this data, used appropriately, can inform public health policies and have an impact on pandemic responses.
Meta Kaggle Code
kaggle.com
zip
Updated Jul 31, 2025
Cite
Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
Explore at:
Available download formats: zip (151045619431 bytes)
Dataset updated
Jul 31, 2025
Dataset authored and provided by
Kaggle (http://kaggle.com/)
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Explore our public notebook content!

Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

Why we’re releasing this dataset

By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

Sensitive data

While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

Joining with Meta Kaggle

The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
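
As a rough illustration of the join described above, the sketch below matches code files to KernelVersions rows by file name. It is only a sketch under stated assumptions: the column name `Id`, the `Title` column, the file extensions, and the sample rows are hypothetical placeholders, not values taken from the dataset.

```python
import csv
import io
from pathlib import PurePosixPath


def index_kernel_versions(csv_file, id_column="Id"):
    """Map each KernelVersions id to its full row.

    The column name "Id" is an assumption; check the actual CSV header.
    """
    return {int(row[id_column]): row for row in csv.DictReader(csv_file)}


def match_files(paths, versions_by_id):
    """Pair each code file with its metadata row (None when no row is found)."""
    matched = {}
    for path in paths:
        stem = PurePosixPath(path).stem  # file name without its extension
        if stem.isdigit():
            matched[path] = versions_by_id.get(int(stem))
    return matched


# Made-up rows standing in for KernelVersions.csv, for illustration only.
sample_csv = io.StringIO("Id,Title\n123456789,Example notebook\n")
versions = index_kernel_versions(sample_csv)

# Hypothetical file paths; the extensions are assumptions.
matches = match_files(
    ["123/456/123456789.ipynb", "123/456/123456790.py"], versions
)
```

With a real download, `sample_csv` would be replaced by an open handle on the KernelVersions csv file from Meta Kaggle.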

File organization

The files are organized into a two-level directory structure. Each top level folder contains up to 1 million files, e.g. - folder 123 contains all versions from 123,000,000 to 123,999,999. Each sub folder contains up to 1 thousand files, e.g. - 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
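
The folder layout described above can be sketched as a small helper that derives the two folder names from a version id. The function name `version_folder` is hypothetical, and the sketch assumes folder names are not zero-padded, which the description does not state.

```python
from pathlib import PurePosixPath


def version_folder(version_id: int) -> PurePosixPath:
    """Return the '<top>/<sub>' folder expected to hold a kernel version.

    Top-level folders span 1,000,000 ids and sub folders span 1,000 ids.
    Assumption: folder names carry no zero padding.
    """
    if version_id < 0:
        raise ValueError("version_id must be non-negative")
    top = version_id // 1_000_000        # 123456789 -> 123
    sub = (version_id // 1_000) % 1_000  # 123456789 -> 456
    return PurePosixPath(str(top)) / str(sub)


folder = version_folder(123_456_789)
```

If the real folders turn out to be zero-padded, only the final line of the function would need to change.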

The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

Questions / Comments

We love feedback! Let us know in the Discussion tab.

Happy Kaggling!
Mobile internet users worldwide 2020-2029
statista.com
Updated Feb 5, 2025
Cite
Statista Research Department (2025). Mobile internet users worldwide 2020-2029 [Dataset]. https://www.statista.com/topics/779/mobile-internet/
Explore at:
Dataset updated
Feb 5, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Description
The global number of smartphone users was forecast to increase continuously between 2024 and 2029 by a total of 1.8 billion users (+42.62 percent). After the ninth consecutive year of growth, the smartphone user base is estimated to reach 6.1 billion users, a new peak, in 2029. Notably, the number of smartphone users has increased continuously over the past years.

Smartphone users here are limited to internet users of any age using a smartphone. The shown figures have been derived from survey data that has been processed to estimate missing demographics.

The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of smartphone users in countries and regions like Australia & Oceania and Asia.

Cite

DataSF (2019). San Francisco Open Data [Dataset]. https://www.kaggle.com/datasf/san-francisco

Data from: San Francisco Open Data

San Francisco Open Data (BigQuery Dataset)

Explore at:

Available download formats: zip (0 bytes)

Dataset updated

Mar 20, 2019

Dataset authored and provided by

DataSF

License

https://creativecommons.org/publicdomain/zero/1.0/

Area covered

San Francisco

Description

Context

DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.

https://datasf.org/about/

Content

This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']

This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services.
This data includes fire unit responses to calls from April 2000 to present and is updated daily. Data contains the call number, incident number, address, unit identifier, call type, and disposition. Relevant time intervals are also included. Because this dataset is based on responses, and most calls involved multiple fire units, there are multiple records for each call number. Addresses are associated with a block number, intersection or call box.
This data includes incidents from the San Francisco Police Department (SFPD) Crime Incident Reporting system, from January 2003 until the present (2 weeks ago from current date). The dataset is updated daily. Please note: the SFPD has implemented a new system for tracking crime. This dataset is still sourced from the old system, which is in the process of being retired (a multi-year process).
This data includes a list of San Francisco Department of Public Works maintained street trees including: planting date, species, and location. Data includes 1955 to present.

This dataset is deprecated and not being updated.

Fork this kernel to get started with this dataset.

Acknowledgements

http://datasf.org/

Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @meric from Unsplash.

Inspiration

Which neighborhoods have the highest proportion of offensive graffiti?

Which complaint is most likely to be made using Twitter and in which neighborhood?

What are the most complained about Muni stops in San Francisco?

What are the top 10 incident types that the San Francisco Fire Department responds to?

How many medical incidents and structure fires are there in each neighborhood?

What’s the average response time for each type of dispatched vehicle?

Which category of police incidents has historically been the most common in San Francisco?

What were the most common police incidents in the category of LARCENY/THEFT in 2016?

Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?

What is the average tree diameter?

What is the highest number of a particular species of tree planted in a single year?

Which San Francisco locations feature the largest number of trees?
