21 datasets found
  1. Data from: San Francisco Open Data

    • kaggle.com
    zip
    Updated Mar 20, 2019
    Cite
    DataSF (2019). San Francisco Open Data [Dataset]. https://www.kaggle.com/datasets/datasf/san-francisco
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Mar 20, 2019
    Dataset authored and provided by
    DataSF
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    San Francisco
    Description

    Context

    DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.

    https://datasf.org/about/

    Content

    This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']

    • This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services.
    • This data includes fire unit responses to calls from April 2000 to present and is updated daily. Data contains the call number, incident number, address, unit identifier, call type, and disposition. Relevant time intervals are also included. Because this dataset is based on responses, and most calls involved multiple fire units, there are multiple records for each call number. Addresses are associated with a block number, intersection or call box.
    • This data includes incidents from the San Francisco Police Department (SFPD) Crime Incident Reporting system, from January 2003 until the present (2 weeks ago from current date). The dataset is updated daily. Please note: the SFPD has implemented a new system for tracking crime. This dataset is still sourced from the old system, which is in the process of being retired (a multi-year process).
    • This data includes a list of San Francisco Department of Public Works maintained street trees including: planting date, species, and location. Data includes 1955 to present.

    This dataset is deprecated and not being updated.

    Fork this kernel to get started with this dataset.

    Acknowledgements

    http://datasf.org/

    Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by @meric from Unsplash.

    Inspiration

    Which neighborhoods have the highest proportion of offensive graffiti?

    Which complaint is most likely to be made using Twitter and in which neighborhood?

    What are the most complained about Muni stops in San Francisco?

    What are the top 10 incident types that the San Francisco Fire Department responds to?

    How many medical incidents and structure fires are there in each neighborhood?

    What’s the average response time for each type of dispatched vehicle?

    Which category of police incidents has historically been the most common in San Francisco?

    What were the most common police incidents in the category of LARCENY/THEFT in 2016?

    Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?

    What is the average tree diameter?

    What is the highest number of a particular species of tree planted in a single year?

    Which San Francisco locations feature the largest number of trees?

  2. NOAA GSOD

    • kaggle.com
    zip
    Updated Aug 30, 2019
    + more versions
    Cite
    NOAA (2019). NOAA GSOD [Dataset]. https://www.kaggle.com/datasets/noaa/gsod
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 30, 2019
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Authors
    NOAA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    Global Surface Summary of the Day is derived from The Integrated Surface Hourly (ISH) dataset. The ISH dataset includes global data obtained from the USAF Climatology Center, located in the Federal Climate Complex with NCDC. The latest daily summary data are normally available 1-2 days after the date-time of the observations used in the daily summaries.

    Content

    Over 9000 stations' data are typically available.

    The daily elements included in the dataset (as available from each station) are:

    • Mean temperature (.1 Fahrenheit)
    • Mean dew point (.1 Fahrenheit)
    • Mean sea level pressure (.1 mb)
    • Mean station pressure (.1 mb)
    • Mean visibility (.1 miles)
    • Mean wind speed (.1 knots)
    • Maximum sustained wind speed (.1 knots)
    • Maximum wind gust (.1 knots)
    • Maximum temperature (.1 Fahrenheit)
    • Minimum temperature (.1 Fahrenheit)
    • Precipitation amount (.01 inches)
    • Snow depth (.1 inches)

    Indicator for occurrence of: Fog, Rain or Drizzle, Snow or Ice Pellets, Hail, Thunder, Tornado/Funnel

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.noaa_gsod.[TABLENAME]. Fork this kernel to get started and learn how to safely analyze large BigQuery datasets.

    Acknowledgements

    This public dataset was created by the National Oceanic and Atmospheric Administration (NOAA) and includes global data obtained from the USAF Climatology Center. This dataset covers GSOD data between 1929 and present, collected from over 9000 stations. Dataset Source: NOAA

    Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Photo by Allan Nygren on Unsplash

  3. ChatGPT reviews [DAILY UPDATED]

    • kaggle.com
    Updated Jan 10, 2025
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ashish Kumar
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset mainly consists of daily-updated user reviews and ratings for the ChatGPT Android App. It also contains data on the relevancy of these reviews and the dates they were posted.

  4. Coronavirus (COVID-19) Mobility Report

    • data.europa.eu
    unknown
    Updated Mar 17, 2021
    + more versions
    Cite
    Chris Fairless (2021). Coronavirus (COVID-19) Mobility Report [Dataset]. https://data.europa.eu/data/datasets/coronavirus-covid-19-mobility-report-2?locale=en
    Explore at:
    Available download formats: unknown
    Dataset updated
    Mar 17, 2021
    Dataset authored and provided by
    Chris Fairless
    Description

    Due to changes in the collection and availability of data on COVID-19, this website will no longer be updated. The webpage will no longer be available as of 11 May 2023. Ongoing, reliable sources of data for COVID-19 are available via the COVID-19 dashboard and the UKHSA.

    GLA Covid-19 Mobility Report

    Since March 2020, London has seen many different levels of restrictions - including three separate lockdowns and many other tiers/levels of restrictions, as well as easing of restrictions and even measures to actively encourage people to go to work, their high streets and local restaurants. This report gathers data from a number of sources, including Google, Apple, Citymapper, Purple WiFi and OpenTable, to assess the extent to which these levels of restrictions have translated to a reduction in Londoners' movements.

    The data behind the charts below come from different sources. None of these data represent a direct measure of how well people are adhering to the lockdown rules - nor do they provide an exhaustive data set. Rather, they are measures of different aspects of mobility, which together offer an overall impression of how Londoners are moving around the capital. The information is broken down by use of public transport, pedestrian activity, retail and leisure, and homeworking.

    Public Transport

    For the transport measures, we have included data from Google, Apple, Citymapper and Transport for London. They measure different aspects of public transport usage, depending on the data source. Each of the lines in the chart below represents a percentage of a pre-pandemic baseline.
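
    To make "a percentage of a pre-pandemic baseline" concrete, here is a minimal Python sketch. It assumes a weekday-median baseline similar to the one described for the Google data in this report. The function names and the sample counts are hypothetical, and each source defines its own baseline, so treat this as an illustration rather than the method used for the charts.

```python
from datetime import date, timedelta
from statistics import median


def weekday_baseline(daily_counts, start, weeks):
    """Median count per weekday over `weeks` weeks starting at `start`.

    `daily_counts` maps a date to an activity count (hypothetical data).
    """
    by_weekday = {}
    for offset in range(weeks * 7):
        day = start + timedelta(days=offset)
        if day in daily_counts:
            by_weekday.setdefault(day.weekday(), []).append(daily_counts[day])
    return {weekday: median(values) for weekday, values in by_weekday.items()}


def percent_of_baseline(day, count, baseline):
    """Express one day's count as a percentage of the same-weekday baseline."""
    return 100.0 * count / baseline[day.weekday()]


# Hypothetical example: a flat baseline of 1000 trips per day for five weeks.
start = date(2020, 1, 3)
counts = {start + timedelta(days=i): 1000 for i in range(35)}
baseline = weekday_baseline(counts, start, weeks=5)
print(percent_of_baseline(date(2020, 4, 3), 204, baseline))
```

    Comparing against the same weekday avoids reading the normal weekday/weekend rhythm as a change in mobility.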

    [Embedded image: https://cdn.datapress.cloud/london/img/dataset/60e5834b-68aa-48d7-a8c5-7ee4781bde05/2025-06-09T20%3A54%3A15/6b096426c4c582dc9568ed4830b4226d.webp]

    Activity | Source | Latest | Baseline | Min value in Lockdown 1 | Min value in Lockdown 2 | Min value in Lockdown 3
    Citymapper | Citymapper mobility index | 2021-09-05 | Compares trips planned and trips taken within its app to a baseline of the four weeks from 6 Jan 2020 | 7.9% | 28% | 19%
    Google | Google Mobility Report | 2022-10-15 | Location data shared by users of Android smartphones, comparing the time and duration of visits to locations with the median values on the same day of the week in the five weeks from 3 Jan 2020 | 20.4% | 40% | 27%
    TfL Bus | Transport for London | 2022-10-30 | Bus journey 'taps' on the TfL network compared to same day of the week in four weeks starting 13 Jan 2020 | - | 34% | 24%
    TfL Tube | Transport for London | 2022-10-30 | Tube journey 'taps' on the TfL network compared to same day of the week in four weeks starting 13 Jan 2020 | - | 30% | 21%

    Pedestrian activity

    With the data we currently have it's harder to estimate pedestrian activity and high street busyness. A few indicators can give us information on how people are making trips out of the house:

    [Embedded image: https://cdn.datapress.cloud/london/img/dataset/60e5834b-68aa-48d7-a8c5-7ee4781bde05/2025-06-09T20%3A54%3A15/bcf082c07e4d7ff5202012f0a97abc3a.webp]

    Activity | Source | Latest | Baseline | Min value in Lockdown 1 | Min value in Lockdown 2 | Min value in Lockdown 3
    Walking | Apple Mobility Index | 2021-11-09 | Estimates the frequency of trips made on foot compared to baseline of 13 Jan '20 | 22% | 47% | 36%
    Parks | Google Mobility Report | 2022-10-15 | Frequency of trips to parks. Changes in the weather mean this varies a lot. Compared to baseline of 5 weeks from 3 Jan '20 | 30% | 55% | 41%
    Retail & Rec | Google Mobility Report | 2022-10-15 | Estimates frequency of trips to shops/leisure locations. Compared to baseline of 5 weeks from 3 Jan '20 | 30% | 55% | 41%

    Retail and recreation

    In this section, we focus on estimated footfall to shops, restaurants, cafes, shopping centres and so on.

    [Embedded image: https://cdn.datapress.cloud/london/img/dataset/60e5834b-68aa-48d7-a8c5-7ee4781bde05/2025-06-09T20%3A54%3A16/b62d60f723eaafe64a989e4afec4c62b.webp]

    Activity | Source | Latest | Baseline | Min value in Lockdown 1 | Min value in Lockdown 2 | Min value in Lockdown 3
    Grocery/pharmacy | Google Mobility Report | 2022-10-15 | Estimates frequency of trips to grocery shops and pharmacies. Compared to baseline of 5 weeks from 3 Jan '20 | 32% | 55.00% | 45.000%
    Retail/rec | <a href="https://ww

  5. Day & night temperatures, 50yrs, 1666ws, TFRecord

    • kaggle.com
    zip
    Updated Nov 9, 2019
    Cite
    Martin Görner (2019). Day & night temperatures, 50yrs, 1666ws, TFRecord [Dataset]. https://www.kaggle.com/datasets/mgorner/day-night-temperatures-50yrs-1666ws-tfrecord
    Explore at:
    Available download formats: zip (160157825 bytes)
    Dataset updated
    Nov 9, 2019
    Authors
    Martin Görner
    License

    https://www.usa.gov/government-works/

    Description

    This dataset is a cleaned-up extract from the following public BigQuery dataset: https://console.cloud.google.com/marketplace/details/noaa-public/ghcn-d

    The dataset contains daily min/max temperatures from a selection of 1666 weather stations. The data spans exactly 50 years. Missing values have been interpolated and are marked as such.

    This dataset is in TFRecord format.

    About the original dataset: NOAA’s Global Historical Climatology Network (GHCN) is an integrated database of climate summaries from land surface stations across the globe that have been subjected to a common suite of quality assurance reviews. The data are obtained from more than 20 sources. The GHCN-Daily is an integrated database of daily climate summaries from land surface stations across the globe, and is comprised of daily climate records from over 100,000 stations in 180 countries and territories, and includes some data from every year since 1763.

  6. Google Safe Browsing Transparency Report Data

    • kaggle.com
    Updated Nov 8, 2019
    Cite
    Rob Rose (2019). Google Safe Browsing Transparency Report Data [Dataset]. http://doi.org/10.34740/kaggle/dsv/784868
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 8, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rob Rose
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    I wanted to make this for potentially using as a helper dataset in the Microsoft Malware Prediction competition. I was also inspired by Kaggle's new ability to create datasets from the outputs of Kernels, which is something I leveraged here.

    Content

    The data is the full data found on the Google Safe Browsing Transparency Report web page. There is plenty of missing data: sometimes the source data doesn't start for a while, and there are periodic gaps for unspecified reasons. It's up to you to determine what to do with those gaps. The reinfection rate has been multiplied by 100 and converted to an int in order to signify percentage.
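
    As a minimal sketch of working with these two properties, the Python below converts an integer percentage back to a fraction and shows one possible way to treat gaps (carrying the last known value forward). The function names and sample values are hypothetical, and forward-filling is only one option; dropping or interpolating the gaps may suit your analysis better.

```python
from typing import List, Optional


def percent_int_to_fraction(value: Optional[int]) -> Optional[float]:
    """Undo the 'multiplied by 100 and converted to an int' step; keep gaps as None."""
    if value is None:
        return None
    return value / 100.0


def forward_fill(values: List[Optional[int]]) -> List[Optional[int]]:
    """Replace each gap with the most recent known value (leading gaps stay None)."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical weekly reinfection rates with gaps.
weekly = [None, 17, None, None, 21]
print(forward_fill(weekly))
print([percent_int_to_fraction(v) for v in weekly])
```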

    Acknowledgements

    Thanks to @rquintino for publishing the splits for the Microsoft competition that originally inspired me to gather this data. And @cdeotte who originally published some scraped datasets in the Microsoft competition, see this discussion post for details.

    Inspiration

    I hope some people find this useful! For the Microsoft challenge or any future challenges! Please leave an upvote here or on the source kernel if you found it useful! I plan to rerun the source kernel weekly on Fridays. I hope Kaggle in the future enables some way to automate that, but for now I just do it manually. If the data is stale, feel free to ping me in the discussions section or on the source kernel and I'll run it.

  7. Chicago Crime

    • kaggle.com
    zip
    Updated Apr 17, 2018
    Cite
    City of Chicago (2018). Chicago Crime [Dataset]. https://www.kaggle.com/datasets/chicago/chicago-crime
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Apr 17, 2018
    Dataset authored and provided by
    City of Chicago
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Chicago
    Description

    Context

    Approximately 10 people are shot on an average day in Chicago.

    http://www.chicagotribune.com/news/data/ct-shooting-victims-map-charts-htmlstory.html
    http://www.chicagotribune.com/news/local/breaking/ct-chicago-homicides-data-tracker-htmlstory.html
    http://www.chicagotribune.com/news/local/breaking/ct-homicide-victims-2017-htmlstory.html

    Content

    This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. This data includes unverified reports supplied to the Police Department. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time.

    Update Frequency: Daily

    Fork this kernel to get started.

    Acknowledgements

    https://bigquery.cloud.google.com/dataset/bigquery-public-data:chicago_crime

    https://cloud.google.com/bigquery/public-data/chicago-crime-data

    Dataset Source: City of Chicago

    This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source —https://data.cityofchicago.org — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by Ferdinand Stohr from Unsplash.

    Inspiration

    What categories of crime exhibited the greatest year-over-year increase between 2015 and 2016?

    Which month generally has the greatest number of motor vehicle thefts?

    How does temperature affect the incident rate of violent crime (assault or battery)?

    https://cloud.google.com/bigquery/images/chicago-scatter.png

  8. Google Stock History

    • kaggle.com
    Updated Oct 25, 2023
    Cite
    PavanKalyan (2023). Google Stock History [Dataset]. https://www.kaggle.com/pavan9065/google-stock-history
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 25, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    PavanKalyan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Google is one of the greatest gifts to mankind. Any information that you need today is available on Google. Google is a household name, and literally everyone is aware of what Google is. It helps you get resources for your school projects, helps you shop online and much more. Google has made getting an education a lot easier for people across the globe. No matter where you are, you can access Google, provided you have internet access. Every piece of info is available on Google and it's all one click away. Google has a publicly traded parent company, Alphabet Inc., and here we have stock data from Alphabet Inc.

    Content

    This data set has 7 columns with all the necessary values, such as the opening price of the stock, its closing price, its highest price in the day and much more. It has date-wise data of the stock from 2004 to October 2023.

  9. GOOGLE MOBILITY DATA

    • kaggle.com
    Updated Oct 5, 2025
    Cite
    AiswaryaRamachandran (2025). GOOGLE MOBILITY DATA [Dataset]. https://www.kaggle.com/aiswaryaramachandran/google-mobility-data/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 5, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    AiswaryaRamachandran
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    As global communities respond to COVID-19, we've heard from public health officials that the same type of aggregated, anonymized insights we use in products such as Google Maps could be helpful as they make critical decisions to combat COVID-19.

    These Community Mobility Reports aim to provide insights into what has changed in response to policies aimed at combating COVID-19. The reports chart movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential. (https://www.google.com/covid19/mobility/)

    Content

    The data contains aggregated and anonymised data per day for each country. For example, to access data for India, use the files 2020_IN_Region_Mobility_Report.csv for 2020 data and 2021_IN_Region_Mobility_Report.csv for 2021 data. The aggregated data is present not only at country level but also at state and district level, as given in sub_region_1 and sub_region_2.
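
    The file naming and the region levels can be sketched in Python as follows. This is an illustration under assumptions: the helper names are hypothetical, the sample rows are made up, and it assumes that an empty sub_region_1 marks a country-level row and an empty sub_region_2 marks a state-level row. The country_region_code and date columns in the sample are also assumed.

```python
import csv
import io


def report_filename(year: int, country_code: str) -> str:
    """Build a file name following the pattern quoted in the description."""
    return f"{year}_{country_code.upper()}_Region_Mobility_Report.csv"


def region_level(row: dict) -> str:
    """Classify a row as country, state or district level (assumed convention)."""
    if not row["sub_region_1"]:
        return "country"
    if not row["sub_region_2"]:
        return "state"
    return "district"


# Made-up sample standing in for the contents of report_filename(2020, "IN").
SAMPLE = """country_region_code,sub_region_1,sub_region_2,date
IN,,,2020-03-01
IN,Kerala,,2020-03-01
IN,Kerala,Ernakulam,2020-03-01
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
state_rows = [row for row in rows if region_level(row) == "state"]
print(report_filename(2020, "IN"), len(state_rows))
```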

    Acknowledgements

    This data is from the report published by Google. https://www.google.com/covid19/mobility/

    Inspiration

    Some Questions to answer

    1. India is having its second wave, and one of the major causes is considered to be the election rallies held in different parts of the country. How does mobility impact COVID cases?

    2. Comparing Mobility across different Countries

  10. COVID19 - The New York Times

    • kaggle.com
    zip
    Updated May 18, 2020
    + more versions
    Cite
    Google BigQuery (2020). COVID19 - The New York Times [Dataset]. https://www.kaggle.com/bigquery/covid19-nyt
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    May 18, 2020
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Authors
    Google BigQuery
    Description

    Context

    This is the US Coronavirus data repository from The New York Times. This data includes COVID-19 cases and deaths reported by state and county. The New York Times compiled this data based on reports from state and local health agencies. More information on the data repository is available here. For additional reporting and data visualizations, see The New York Times’ U.S. coronavirus interactive site.

    Sample Queries

    Query 1

    Which US counties have the most confirmed cases per capita? This query determines which counties have the most cases per 100,000 residents. Note that this may differ from similar queries of other datasets because of differences in reporting lag, methodologies, or other dataset differences.

    SELECT
      covid19.county,
      covid19.state_name,
      total_pop AS county_population,
      confirmed_cases,
      ROUND(confirmed_cases/total_pop *100000,2) AS confirmed_cases_per_100000,
      deaths,
      ROUND(deaths/total_pop *100000,2) AS deaths_per_100000
    FROM `bigquery-public-data.covid19_nyt.us_counties` covid19
    JOIN `bigquery-public-data.census_bureau_acs.county_2017_5yr` acs
      ON covid19.county_fips_code = acs.geo_id
    WHERE date = DATE_SUB(CURRENT_DATE(),INTERVAL 1 day)
      AND covid19.county_fips_code != "00000"
    ORDER BY confirmed_cases_per_100000 desc
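
    The per-capita arithmetic in Query 1 can be checked by hand with a short Python sketch. The county names, FIPS codes and figures below are made up for illustration, and the helper names are hypothetical. Only the formula (count / population * 100000, rounded to two decimals), the exclusion of FIPS code "00000" and the descending sort mirror the query.

```python
def per_100k(count: int, population: int) -> float:
    """Rate per 100,000 residents, rounded to two decimals as in the query."""
    return round(count / population * 100000, 2)


def rank_counties(rows):
    """Drop the '00000' placeholder code and sort by cases per 100,000, highest first."""
    ranked = []
    for row in rows:
        if row["county_fips_code"] == "00000":
            continue
        ranked.append({
            "county": row["county"],
            "confirmed_cases_per_100000": per_100k(row["confirmed_cases"], row["total_pop"]),
            "deaths_per_100000": per_100k(row["deaths"], row["total_pop"]),
        })
    return sorted(ranked, key=lambda r: r["confirmed_cases_per_100000"], reverse=True)


# Made-up rows standing in for the joined case and population tables.
sample = [
    {"county": "Alpha", "county_fips_code": "11111", "confirmed_cases": 1234, "deaths": 25, "total_pop": 500000},
    {"county": "Beta", "county_fips_code": "22222", "confirmed_cases": 300, "deaths": 3, "total_pop": 100000},
    {"county": "Unknown", "county_fips_code": "00000", "confirmed_cases": 50, "deaths": 1, "total_pop": 1000},
]
for entry in rank_counties(sample):
    print(entry)
```

    A small county can outrank a large one here even with far fewer cases, which is the point of normalising by population.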

    Query 2

    How do I calculate the number of new COVID-19 cases per day? This query determines the total number of new cases in each state for each day available in the dataset.

    SELECT
      b.state_name,
      b.date,
      MAX(b.confirmed_cases - a.confirmed_cases) AS daily_confirmed_cases
    FROM (
      SELECT
        state_name AS state,
        state_fips_code,
        confirmed_cases,
        DATE_ADD(date, INTERVAL 1 day) AS date_shift
      FROM `bigquery-public-data.covid19_nyt.us_states`
      WHERE confirmed_cases + deaths > 0
    ) a
    JOIN `bigquery-public-data.covid19_nyt.us_states` b
      ON a.state_fips_code = b.state_fips_code
      AND a.date_shift = b.date
    GROUP BY b.state_name, date
    ORDER BY date desc
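
    Query 2 works by joining each day's cumulative count to the previous day's count (shifted forward by one day) and subtracting. The Python sketch below reproduces that idea on made-up data. The function name and sample figures are hypothetical, and for brevity it leaves out the query's filter on confirmed_cases + deaths > 0.

```python
from datetime import date, timedelta


def daily_new_cases(cumulative):
    """Difference each cumulative count against the previous calendar day.

    `cumulative` maps (state, date) to cumulative confirmed cases. As with the
    inner join in the query, a day is skipped when the previous day is absent.
    """
    one_day = timedelta(days=1)
    new_cases = {}
    for (state, day), total in cumulative.items():
        previous = cumulative.get((state, day - one_day))
        if previous is not None:
            new_cases[(state, day)] = total - previous
    return new_cases


# Made-up cumulative counts for one state, with 4 March missing.
sample_counts = {
    ("Examplestate", date(2020, 3, 1)): 10,
    ("Examplestate", date(2020, 3, 2)): 15,
    ("Examplestate", date(2020, 3, 3)): 27,
    ("Examplestate", date(2020, 3, 5)): 40,
}
for key, value in sorted(daily_new_cases(sample_counts).items()):
    print(key, value)
```

    Note that a missing day produces no row for the following day either, so gaps in the source data show up as gaps in the daily series rather than as inflated counts.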

  11. Data from: Recognizing the importance of near-home contact with nature for...

    • data.niaid.nih.gov
    • datasetcatalog.nlm.nih.gov
    • +2more
    zip
    Updated Aug 29, 2023
    Cite
    Magdalena Lenda; Piotr Skórka; Małgorzata Jaźwa; Hsien-Yung Lin; Edward Nęcka; Piotr Tryjanowski; Dawid Moroń; Johannes M. H. Knops; Hugh P. Possingham (2023). Recognizing the importance of near-home contact with nature for mental well-being based on the COVID-19 lockdown experience [Dataset]. http://doi.org/10.5061/dryad.fn2z34v1h
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 29, 2023
    Dataset provided by
    The University of Queensland
    Institute of Nature Conservation
    University of Opole
    Uniwersytet SWPS
    Institute of Systematics and Evolution of Animals
    Carleton University
    University of Life Sciences in Poznań
    Xi’an Jiaotong-Liverpool University
    Authors
    Magdalena Lenda; Piotr Skórka; Małgorzata Jaźwa; Hsien-Yung Lin; Edward Nęcka; Piotr Tryjanowski; Dawid Moroń; Johannes M. H. Knops; Hugh P. Possingham
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Several urban landscape planning solutions have been introduced around the world to find a balance between developing urban spaces, maintaining and restoring biodiversity, and enhancing quality of human life. Our global mini-review, combined with analysis of big data collected from Google Trends at global scale, reveals the importance of enjoying day-to-day contact with nature and engaging in such activities as nature observation and identification and gardening for the mental well-being of humans during the COVID-19 pandemic. Home-based activities, such as watching birds from one’s window, identifying species of plants and animals, backyard gardening, and collecting information about nature for citizen science projects, were popular during the first lockdown in spring 2020, when people could not easily venture out of their homes. In our mini-review, we found 37 articles from 28 countries with a total sample of 114,466 people. These papers suggest that home-based engagement with nature was an entertaining and pleasant distraction that helped preserve mental well-being during a challenging time. According to Google Trends, interest in such activities increased during lockdown compared to the previous five years. Millions of people worldwide are chronically or temporarily confined to their homes and neighborhoods because of illness, childcare chores, or elderly care responsibility, which makes it difficult for them to travel far to visit such places as national parks, created through land sparing, where people go to enjoy nature and relieve stress. This article posits that for such people, living in an urban landscape designed to facilitate effortless contact with small natural areas is a more effective way to receive the mental health benefits of contact with nature than visiting a sprawling nature park on rare occasions.

    Methods

    1. Identifying the most common types of activities related to nature observation, gardening, and taxa identification during the first lockdown based on scientific articles and non-scientific press

    For scientific articles, in March 2023 we searched Scopus and Google Scholar. For countries where Google is restricted, such as China, similar results will be available from other scientific browsers, with the highest number of results from our database being available from Scopus. We used the Google Search browser to search for globally published non-scientific press articles. Some selection criteria were applied during article review. Specifically, we excluded articles that were not about the first lockdown; did not study activities at a local scale (from balcony, window, backyard) but rather in areas far away from home (e.g., visiting forests); studied the mental health effect of observing indoor potted plants and pet animals; or transiently mentioned the topic or keyword without going into any scientific detail. We included all papers that met our criteria, that is, studies that analyzed our chosen topic with experiments or planned observations. We included all research papers, but not letters that made claims without any data. Google Scholar automatically screened the title, abstract, keywords, and the whole text of each article for the keywords we entered. All articles that met our criteria were read and double-checked for keywords and content related to the keywords (e.g., synonyms or if they presented content about the relevant topic without using the specific keywords). We identified, from both types of articles, the major nature-based activities that people engaged in during the first lockdown in the spring of 2020.

    Keywords used in this study were grouped into six main topics: (1) COVID-19 pandemic; (2) nature-oriented activity focused on nature observation, identification of different taxa, or gardening; (3) mental well-being; (4) activities performed from a balcony, window, or in gardens; (5) entertainment; and (6) citizen science (see Table 1 for all keywords).

    2. Increase in global trends in interest in nature observation, gardening, and taxa identification during the first lockdown

    We used the categorical cluster method, which was combined with big data from Google Trends (downloaded on 1 September 2020) and anomaly detection to identify trend anomalies globally in peoples’ interests. We used this combination of methods to examine whether interest in nature-based activities that were mentioned in scientific and nonscientific press articles increased during the first lockdown. Keywords linked with the main types of nature-oriented activities, as identified from press and scientific articles, and used according to the categorical clustering method were classified into the following six main categories: (1) global interest in bird-watching and bird identification combined with citizen science; (2) global interest in plant identification and gardening combined with citizen science; (3) global interest in butterfly watching; (4) local interest in early-spring (lockdown time), summer, or autumn flowering species that usually can be found in Central European (country: Poland) backyards; (5) global interest in traveling and social activities; and (6) global interest in nature areas and activities typically enjoyed during holidays and thus requiring traveling to land-spared nature reserves.

    The six categories were divided into 15 subcategories so that we could attach relevant words or phrases belonging to the same cluster and typically related to the activity (according to Google Trends and Google browser’s automatic suggestions; e.g., people who searched for “bird-watching” typically also searched for “binoculars,” “bird feeder,” “bird nest,” and “birdhouse”). The subcategories and keywords used for data collection about trends in society’s interest in the studied topic from Google Trends are as follows.

    • Bird-watching: “binoculars,” “bird feeder,” “bird nest,” “birdhouse,” “bird-watching”
    • Bird identification: “bird app,” “bird identification,” “bird identification app,” “bird identifier,” “bird song app”
    • Bird-watching combined with citizen science: “bird guide,” “bird identification,” “eBird,” “feeding birds,” “iNaturalist”
    • Citizen science and bird-watching apps: “BirdNET,” “BirdSong ID,” “eBird,” “iNaturalist,” “Merlin Bird ID”
    • Gardening: “gardening,” “planting,” “seedling,” “seeds,” “soil”
    • Shopping for gardening: “garden shop,” “plant buy,” “plant ebay,” “plant sell,” “plant shop”
    • Plant identification apps: “FlowerChecker,” “LeafSnap,” “NatureGate,” “Plantifier,” “PlantSnap”
    • Citizen science and plant identification: “iNaturalist,” “plant app,” “plant check,” “plant identification app,” “plant identifier”
    • Flowers that were flowering in gardens during lockdown in Poland: “fiołek” (viola), “koniczyna” (shamrock), “mlecz” (dandelion), “pierwiosnek” (primrose), “stokrotka” (daisy). They are typical early-spring flowers growing in gardens in Central Europe. We had to be more specific in this search because there are no plant species blooming across the world at the same time. These plant species have well-known biology; thus, we could easily interpret these results.
    • Flowers that were not flowering during lockdown in Poland: “chaber” (cornflower), “mak” (poppy), “nawłoć” (goldenrod), “róża” (rose), “rumianek” (chamomile). They are typical mid-summer flowering plants often planted in gardens.
    • Interest in traveling long distances and in social activities that involve many people: “airport,” “bus,” “café,” “driving,” “pub”
    • Single or mass commuting, and traveling: “bike,” “boat,” “car,” “flight,” “train”
    • Interest in distant places and activities for visiting natural areas: “forest,” “nature park,” “safari,” “trekking,” “trip”
    • Places and activities for holidays (typically located far away): “coral reef,” “rainforest,” “safari,” “savanna,” “snorkeling”
    • Butterfly watching: “butterfly watching,” “butterfly identification,” “butterfly app,” “butterfly net,” “butterfly guide”

    In Google Trends, we set the following filters: global search, dates: July 2016–July 2020; language: English.
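
    The description names anomaly detection on Google Trends series but does not say which algorithm was used. As a rough illustration only, the sketch below flags points that deviate strongly from a pre-lockdown baseline using a simple z-score rule. The function name, the threshold of 3 standard deviations, and the sample values are all assumptions made for this example; they are not taken from the study or from Google Trends.

```python
from statistics import mean, stdev
from typing import List


def flag_anomalies(series: List[float], baseline_len: int,
                   threshold: float = 3.0) -> List[int]:
    """Return indices after the baseline period whose value lies more than
    `threshold` baseline standard deviations away from the baseline mean."""
    baseline = series[:baseline_len]
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs >= 2 points
    later = range(baseline_len, len(series))
    if sigma == 0:
        # A flat baseline: any departure from it counts as an anomaly.
        return [i for i in later if series[i] != mu]
    return [i for i in later if abs(series[i] - mu) / sigma > threshold]


# Hypothetical weekly interest scores (0-100 scale), invented for illustration.
weekly_interest = [40, 42, 41, 39, 43, 40, 41, 42, 75, 80, 44]
flagged = flag_anomalies(weekly_interest, baseline_len=8)
```

    Here the first eight values serve as the baseline, and `flagged` holds the positions of later values that stand out against it. A seasonal baseline (for example, the same weeks in earlier years) would likely be a better fit for search data with yearly cycles.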

  12. Inflation Drives People to Google Negative Concepts (Forecast)

    • kappasignal.com
    Updated Jun 11, 2023
    Cite
    KappaSignal (2023). Inflation Drives People to Google Negative Concepts (Forecast) [Dataset]. https://www.kappasignal.com/2023/06/inflation-drives-people-to-google.html
    Explore at:
    Dataset updated
    Jun 11, 2023
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Inflation Drives People to Google Negative Concepts

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  13. Data from: Fine-Scale Spatiotemporal Air Pollution Analysis Using Mobile...

    • tandf.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated May 30, 2023
    Cite
    Yawen Guan; Margaret C. Johnson; Matthias Katzfuss; Elizabeth Mannshardt; Kyle P. Messier; Brian J. Reich; Joon J. Song (2023). Fine-Scale Spatiotemporal Air Pollution Analysis Using Mobile Monitors on Google Street View Vehicles [Dataset]. http://doi.org/10.6084/m9.figshare.10113239.v3
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Yawen Guan; Margaret C. Johnson; Matthias Katzfuss; Elizabeth Mannshardt; Kyle P. Messier; Brian J. Reich; Joon J. Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    People are increasingly concerned with understanding their personal environment, including possible exposure to harmful air pollutants. To make informed decisions on their day-to-day activities, they are interested in real-time information on a localized scale. Publicly available, fine-scale, high-quality air pollution measurements acquired using mobile monitors represent a paradigm shift in measurement technologies. A methodological framework utilizing these increasingly fine-scale measurements to provide real-time air pollution maps and short-term air quality forecasts on a fine-resolution spatial scale could prove to be instrumental in increasing public awareness and understanding. The Google Street View study provides a unique source of data with spatial and temporal complexities, with the potential to provide information about commuter exposure and hot spots within city streets with high traffic. We develop a computationally efficient spatiotemporal model for these data and use the model to make short-term forecasts and high-resolution maps of current air pollution levels. We also show via an experiment that mobile networks can provide more nuanced information than an equally sized fixed-location network. This modeling framework has important real-world implications in understanding citizens’ personal environments, as data production and real-time availability continue to be driven by the ongoing development and improvement of mobile measurement technologies. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.

  14. Data from: Novel Corona Virus 2019 Dataset

    • kaggle.com
    zip
    Updated Jan 30, 2020
    Cite
    SRK (2020). Novel Corona Virus 2019 Dataset [Dataset]. https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
    Explore at:
    Available download formats: zip (3155 bytes)
    Dataset updated
    Jan 30, 2020
    Authors
    SRK
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.

    So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.

    Johns Hopkins University has made an excellent dashboard using the affected cases data. This data is extracted from the same link and made available in csv format.

    Content

    2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC

    This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus.

    The data is available from 22 Jan 2020.

    Acknowledgements

    Johns Hopkins University has made the data available in Google Sheets format. Sincere thanks to them.

    Thanks to WHO, CDC, NHC and DXY for making the data available in first place.

    Picture courtesy : Johns Hopkins University dashboard

    Inspiration

    Some insights could be

    1. Changes in number of affected cases over time
    2. Change in cases over time at country level
    3. Latest number of affected cases
  15. Bellabeat Case Study Capstone Steps vs Sleep

    • kaggle.com
    Updated Jun 13, 2022
    Cite
    Brennan Grout (2022). Bellabeat Case Study Capstone Steps vs Sleep [Dataset]. https://www.kaggle.com/datasets/brennangrout/bellabeat-case-study-capstone-steps-vs-sleep
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 13, 2022
    Dataset provided by
    Kaggle
    Authors
    Brennan Grout
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The data set and tools can be found at the GitHub link here: https://github.com/groutbrennan/cleaning-data-with-r/tree/master/capstone_data/working_data

    This dataset contains:

    • Data
    • R markdown
    • R analysis and cleaning scripts
    • Final ggplot scatterplot visualization image

    This dataset was created as part of the Google data analysis course presented by Coursera comparing how people use their smart devices to track their daily health.

    After reviewing the initial data, my hypothesis was that people who walk more sleep longer.

    However, after cleaning, transforming, and analyzing the data, I found that people who took more steps during the day actually slept fewer total minutes than people who took fewer steps. In other words, more steps taken during the day correlated with fewer minutes of sleep at night. I don't have proof that this relationship is causal; further research would be needed to confirm it.

  16. Atlanta Crime Data 2009 - Present

    • kaggle.com
    Updated Dec 11, 2020
    Cite
    Peng Chen charles (2020). Atlanta Crime Data 2009 - Present [Dataset]. https://www.kaggle.com/pengchencharles/atlanta-crime-data2020/tasks
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 11, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Peng Chen charles
    Area covered
    Atlanta
    Description

    Context

    A majority of crime in 2020 happened in Downtown and Midtown.

    Content

    This dataset reflects reported incidents of crime that occurred in the City of Atlanta from 2009 to present. Data is extracted from Atlanta Police Department's official website. This data includes unverified reports supplied to the Police Department. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, Atlanta Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time.

    Update Frequency: Daily

    Fork this kernel to get started.

    Acknowledgements

    https://www.atlantapd.org/i-want-to/crime-data-downloads

    Dataset Source: City of Atlanta

    This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://www.atlantapd.org/i-want-to/crime-data-downloads - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by https://wallpapermemory.com/199170

    Inspiration

    What categories of crime exhibited the greatest year-over-year increase between 2015 and 2016?

    Which month generally has the greatest number of motor vehicle thefts?

    How does temperature affect the incident rate of violent crime (assault or battery)?

  17. Facebook: distribution of global audiences 2024, by age and gender

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Facebook: distribution of global audiences 2024, by age and gender [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, men between the ages of 25 and 34 years made up Facebook's largest audience, accounting for 18.4 percent of global users. Facebook's second largest audience base was men aged 18 to 24 years.

    Facebook connects the world

    Founded in 2004 and going public in 2012, Facebook is one of the biggest internet companies in the world with influence that goes beyond social media. It is widely considered as one of the Big Four tech companies, along with Google, Apple, and Amazon (all together known under the acronym GAFA). Facebook is the most popular social network worldwide and the company also owns three other billion-user properties: mobile messaging apps WhatsApp and Facebook Messenger, as well as photo-sharing app Instagram.

    Facebook users

    The vast majority of Facebook users connect to the social network via mobile devices. This is unsurprising, as Facebook has many users in mobile-first online markets. Currently, India ranks first in terms of Facebook audience size with 378 million users. The United States, Brazil, and Indonesia also all have more than 100 million Facebook users each.
    
  18. Covid19 Dataset (Worldwide cases 2019-20)

    • kaggle.com
    zip
    Updated Dec 31, 2020
    Cite
    Vivekkumar Gediya (2020). Covid19 Dataset (Worldwide cases 2019-20) [Dataset]. https://www.kaggle.com/vivekgediya/covid19-case-worldwide-cases-till-30th-dec20
    Explore at:
    Available download formats: zip (327132 bytes)
    Dataset updated
    Dec 31, 2020
    Authors
    Vivekkumar Gediya
    Description

    Context

    From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.

    So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.

    Johns Hopkins University has made an excellent dashboard using the affected cases data. The data is extracted from the associated Google Sheets and made available here.

    Edited

    Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.

    Content

    2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC

    This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus. Please note that this is time series data, so the number of cases reported for any given day is the cumulative number.
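
    Because the counts are cumulative, per-day figures have to be derived by differencing consecutive days. The sketch below shows one way to do that; the function name and the sample numbers are invented for illustration and are not taken from the dataset, whose column layout is not described here.

```python
from typing import List


def daily_new_cases(cumulative: List[int]) -> List[int]:
    """Convert a cumulative series into per-day increments.

    The first day's increment is taken to be the first cumulative value.
    """
    if not cumulative:
        return []
    daily = [cumulative[0]]
    # Each later increment is the difference between consecutive days.
    for prev, curr in zip(cumulative, cumulative[1:]):
        daily.append(curr - prev)
    return daily


# Illustrative numbers only, not real case counts.
example_cumulative = [1, 3, 3, 7, 12]
example_daily = daily_new_cases(example_cumulative)
```

    A negative increment would indicate a downward correction in the source data and is worth checking before analysis.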

    The data is available from 22 Jan, 2020 to 30 Dec, 2020.

    Sources

    JHU confirmed covid datasets.

  19. Top 200 Youtubers Data (cleaned)

    • kaggle.com
    Updated Jul 8, 2022
    Cite
    Syed Jafer (2022). Top 200 Youtubers Data (cleaned) [Dataset]. https://www.kaggle.com/syedjaferk/top-200-youtubers-cleaned/discussion
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 8, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Syed Jafer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    YouTube is an American online video sharing and social media platform headquartered in San Bruno, California. It was launched on February 14, 2005, by Steve Chen, Chad Hurley, and Jawed Karim. It is owned by Google, and is the second most visited website, after Google Search. YouTube has more than 2.5 billion monthly users who collectively watch more than one billion hours of videos each day. As of May 2019, videos were being uploaded at a rate of more than 500 hours of content per minute.

    YouTube is widely used to influence and educate people (users and followers) on specific issues, and it serves as a free university (for me as well), which can affect the existing order in some ways.

  20. Most valuable media & entertainment brands worldwide 2024

    • statista.com
    • es.statista.com
    Cite
    Julia Faria, Most valuable media & entertainment brands worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Julia Faria
    Description

    In 2024, Google ranked as the most valuable media and entertainment brand worldwide, with a brand value of 683 billion U.S. dollars. Facebook ranked second, valued at around 167 billion dollars. Part of the Tencent Group, WeChat and v.qq.com (Tencent Video) had a brand value of 56 billion and 17.5 billion dollars, respectively.

Cite
DataSF (2019). San Francisco Open Data [Dataset]. https://www.kaggle.com/datasets/datasf/san-francisco

Data from: San Francisco Open Data

San Francisco Open Data (BigQuery Dataset)

Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
DataSF
License

https://creativecommons.org/publicdomain/zero/1.0/

Area covered
San Francisco
Description

Context

DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.

https://datasf.org/about/

Content

This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']

  • This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services.
  • This data includes fire unit responses to calls from April 2000 to present and is updated daily. Data contains the call number, incident number, address, unit identifier, call type, and disposition. Relevant time intervals are also included. Because this dataset is based on responses, and most calls involved multiple fire units, there are multiple records for each call number. Addresses are associated with a block number, intersection or call box.
  • This data includes incidents from the San Francisco Police Department (SFPD) Crime Incident Reporting system, from January 2003 until the present (2 weeks ago from current date). The dataset is updated daily. Please note: the SFPD has implemented a new system for tracking crime. This dataset is still sourced from the old system, which is in the process of being retired (a multi-year process).
  • This data includes a list of San Francisco Department of Public Works maintained street trees including: planting date, species, and location. Data includes 1955 to present.

This dataset is deprecated and not being updated.

Fork this kernel to get started with this dataset.

Acknowledgements

http://datasf.org/

Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @meric from Unsplash.

Inspiration

Which neighborhoods have the highest proportion of offensive graffiti?

Which complaint is most likely to be made using Twitter and in which neighborhood?

What are the most complained about Muni stops in San Francisco?

What are the top 10 incident types that the San Francisco Fire Department responds to?

How many medical incidents and structure fires are there in each neighborhood?

What’s the average response time for each type of dispatched vehicle?

Which category of police incidents has historically been the most common in San Francisco?

What were the most common police incidents in the category of LARCENY/THEFT in 2016?

Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?

What is the average tree diameter?

What is the highest number of a particular species of tree planted in a single year?

Which San Francisco locations feature the largest number of trees?
