11 datasets found

Google Trends
console.cloud.google.com
Updated May 10, 2022
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:BigQuery%20Public%20Datasets%20Program&hl=ja (2022). Google Trends [Dataset]. https://console.cloud.google.com/marketplace/product/bigquery-public-datasets/google-search-trends?hl=ja
Explore at:
Dataset updated
May 10, 2022
Dataset provided by
Google (http://google.com/)
BigQuery (https://cloud.google.com/bigquery)
Google Search (http://google.com/)
Description
The Google Trends dataset provides signals that individual users and businesses alike can use to make better data-driven decisions. It simplifies manual interaction with the existing Google Trends UI by automating and exposing anonymized, aggregated, and indexed search data in BigQuery. The dataset includes the Top 25 stories and Top 25 Rising queries from Google Trends, made available as two separate BigQuery tables, with a set of new top terms appended daily. Each set of Top 25 and Top 25 Rising terms expires after 30 days and is accompanied by a rolling five-year window of historical data for 210 distinct locations in the United States. This Google dataset is hosted in Google BigQuery as part of Google Cloud's Datasets solution and is included in BigQuery's free tier of 1 TB of processing per month: each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
MLP-based Learnable Window Size Dataset for Bitcoin Market Price
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Cite
Rajabi, Shahab (2023). MLP-based Learnable Window Size Dataset for Bitcoin Market Price [Dataset]. http://doi.org/10.7910/DVN/5YBLKV
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/5YBLKV
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Rajabi, Shahab
Description
The dataset for this paper was collected from Google, Blockchain, and the Bitcoin market. There are 26 features in total; however, any feature for which the correlation between the variations of price and the variations of the feature is lower than 0.3 was eliminated. Hence, a total of 21 practical features were selected, including Market capitalization, Trade-volume, Transaction-fees USD, Average confirmation time, Difficulty, High price, Low price, Total hash rate, Block-size, Miners-revenue, N-transactions-total, Google searches, Open price, N-payments-per Block, Total circulating Bitcoin, Cost-per-transaction percent, Fees-USD-per transaction, N-unique-addresses, N-transactions-per block, and Output-volume. In addition to the values of these features, a supportive feature is created for each one, containing the difference between the value of the previous day and the value of the day before the previous day. In terms of size and time span, a total of 1275 training samples, collected from 12 Nov 2018 to 4 Jun 2021, were used in the proposed model to extract patterns of the Bitcoin price.
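The two preprocessing steps described above (dropping weakly correlated features and adding a lagged-difference feature) can be sketched as follows. This is only an illustration under stated assumptions, not the author's code: the function names, the toy numbers, the use of Pearson correlation, and the use of the absolute value of the correlation are all assumptions, since the description does not say which correlation measure was applied.

```python
from math import sqrt


def variations(series):
    """Day-to-day changes: series[t] - series[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def select_features(price, features, threshold=0.3):
    """Keep features whose variations correlate with price variations.

    Assumption: the absolute correlation is compared to the threshold.
    """
    price_changes = variations(price)
    return [
        name
        for name, values in features.items()
        if abs(pearson(price_changes, variations(values))) >= threshold
    ]


def lagged_difference(series):
    """Supportive feature for day t: series[t-1] - series[t-2].

    The first two days have no such difference, so they are None.
    """
    return [None, None] + [
        series[t - 1] - series[t - 2] for t in range(2, len(series))
    ]


# Hypothetical toy data, not taken from the dataset.
price = [100, 102, 101, 105, 107, 106]
features = {
    "volume": [10, 12, 11, 15, 17, 16],  # changes track the price changes
    "steady": [5, 6, 7, 8, 9, 10],       # constant changes, no correlation
}
kept = select_features(price, features)
support = lagged_difference([1, 3, 6, 10])
```

Here `kept` retains only the feature whose daily changes move with the price, and `support` is the extra column that would be appended alongside the original feature.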
Google energy consumption 2011-2023
statista.com
Updated Oct 11, 2024
Cite
Statista (2024). Google energy consumption 2011-2023 [Dataset]. https://www.statista.com/statistics/788540/energy-consumption-of-google/
Explore at:
Dataset updated
Oct 11, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
Google’s energy consumption has increased over the last few years, reaching 25.9 terawatt hours in 2023, up from 12.8 terawatt hours in 2019. The company has made efforts to make its data centers more efficient through customized high-performance servers, smart temperature and lighting controls, advanced cooling techniques, and machine learning.

Datacenters and energy
Through its operations, Google pursues a more sustainable impact on the environment by creating efficient data centers that use less energy than the average, transitioning towards renewable energy, creating sustainable workplaces, and providing its users with the technological means towards a cleaner future for coming generations. Through its efficient data centers, Google has also managed to divert waste from its operations away from landfills.

Reducing Google’s carbon footprint
Google’s clean energy efforts are also related to its efforts to reduce its carbon footprint. Since committing to 100 percent renewable energy, the company has met its targets largely through solar and wind power purchase agreements and by buying renewable power from utilities. Google is one of the largest corporate purchasers of renewable energy in the world.
#WLIC2016 Most Frequent Terms Roundup
city.figshare.com
bin
Updated May 31, 2023
Cite
Ernesto Priego (2023). #WLIC2016 Most Frequent Terms Roundup [Dataset]. http://doi.org/10.6084/m9.figshare.3749367.v2
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.3749367.v2
Dataset updated
May 31, 2023
Dataset provided by
City, University of London
Authors
Ernesto Priego
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
IFLA stands for The International Federation of Library Associations and Institutions. The IFLA World Library and Information Congress 2016 and 82nd IFLA General Conference and Assembly, ‘Connections. Collaboration. Community’, took place 13–19 August 2016 at the Greater Columbus Convention Center (GCCC) in Columbus, Ohio, United States. The official hashtag of the conference was #WLIC2016.

This spreadsheet contains the results of a text analysis of 22327 Tweets publicly labeled with #WLIC2016 between Sunday 14 and Thursday 18 August 2016. The source dataset was collected with a Twitter Archiving Google Spreadsheet and the automated text analysis was done with the Terms tool from Voyant Tools.

The spreadsheet contains:
A sheet containing a table summarising the source archive.
A sheet containing a table detailing tweet counts per day.
Sheets containing the 'raw' (no stop words, no manual refining) tables of the top 300 most frequent terms and their counts for the Sun-Thu corpus and each individual corpus (1 per day).
Sheets containing the 'edited' (edited English stop word filter applied, manually refined) tables of the top 50 most frequent terms and their counts for the Sun-Thu corpus and each individual corpus (1 per day).
A sheet containing a comparison table of the top 50 per day.

Other Considerations
Only Tweets published by accounts with at least one follower were included in the source archive.

Both research and experience show that the Twitter search API is not 100% reliable. Large Tweet volumes affect the search collection process. The API might "over-represent the more central users", not offering "an accurate picture of peripheral activity" (González-Bailon, Sandra, et al, 2012).

Apart from the filters and limitations already declared, it cannot be guaranteed that each and every Tweet tagged with #WLIC2016 during the indicated period was analysed. The dataset was shared for archival, comparative and indicative educational research purposes only.

Only content from public accounts, obtained from the Twitter Search API, was analysed. The source data is also publicly available to all Twitter users via the Twitter Search API and available to anyone with an Internet connection via the Twitter and Twitter Search web client and mobile apps without the need of a Twitter account.

This file contains the results of analyses of Tweets that were published openly on the Web with the queried hashtag; the source Tweets are not included. The content of the source Tweets is the responsibility of the original authors. Original Tweets are likely to be copyright of their individual authors but please check individually. This work is shared to archive, document and encourage open educational research into scholarly activity on Twitter. The resulting dataset does not contain complete Tweets nor Twitter metadata. No private personal information was shared. The collection, analysis and sharing of the data has been enabled and allowed by Twitter's Privacy Policy. The sharing of the results complies with Twitter's Developer Rules of the Road.

A hashtag is metadata users choose freely to use so their content is associated, directly linked to and categorised with the chosen hashtag. The purpose and function of hashtags is to organise and describe information/outputs under the relevant label in order to enhance the discoverability of the labeled information/outputs (Tweets in this case). Tweets published publicly by scholars or other professionals during academic conferences are often publicly tagged (labeled) with a hashtag dedicated to the conference in question. This practice used to be confined to a few 'niche' fields; it is increasingly becoming the norm rather than the exception.

Though not every reason for Tweeters' use of hashtags can be generalised or predicted, it can be argued that scholarly Twitter users form specialised, self-selecting public professional networks that tend to observe scholarly practices and accepted modes of social and professional behaviour. In general terms it can be argued that scholarly Twitter users willingly and consciously tag their public Tweets with a conference hashtag as a means to network and to promote, report from, reflect on, comment on and generally contribute publicly to the scholarly conversation around conferences. As Twitter users, conference Twitter hashtag contributors have agreed to Twitter's Privacy and data sharing policies.

Professional associations like the Modern Language Association and the American Psychological Association recognise Tweets as citeable scholarly outputs. Archiving scholarly Tweets is a means to preserve this form of rapid online scholarship that otherwise can very likely become unretrievable as time passes; Twitter's search API has well-known temporal limitations for retrospective historical search and collection.

Beyond individual Tweets as scholarly outputs, the collective scholarly activity on Twitter around a conference or academic project or event can provide interesting insights for the contemporary history of scholarly communications. Though this work has limitations and might not be thoroughly systematic, it is hoped it can contribute to developing new insights into a discipline's public concerns as expressed on Twitter over time.

As is increasingly recommended for data sharing, the CC-0 license has been applied to the resulting output in the repository. It is important however to bear in mind that some terms appearing in the dataset might be licensed individually differently; copyright of the source Tweets -and sometimes of individual terms- belongs to their authors. Authorial/curatorial/collection work has been performed on the shared file as a curated dataset resulting from analysis, in order to make it available as part of the scholarly record. If this dataset is consulted, attribution is always welcome.

Ideally, for proper reproducibility and to encourage other studies, the whole archive dataset should be available. Those wishing to obtain the whole Tweets should still be able to get them themselves via text and data mining methods.
Lead Scoring Dataset
kaggle.com
zip
Updated Aug 17, 2020
Cite
Amrita Chatterjee (2020). Lead Scoring Dataset [Dataset]. https://www.kaggle.com/amritachatterjee09/lead-scoring-dataset
Explore at:
Available download formats: zip (411028 bytes)
Dataset updated
Aug 17, 2020
Authors
Amrita Chatterjee
Description
Context

An education company named X Education sells online courses to industry professionals. On any given day, many professionals who are interested in the courses land on their website and browse for courses.

The company markets its courses on several websites and search engines like Google. Once these people land on the website, they might browse the courses, fill out a form for a course, or watch some videos. When these people fill out a form providing their email address or phone number, they are classified as a lead. The company also gets leads through past referrals. Once these leads are acquired, employees from the sales team start making calls, writing emails, etc. Through this process, some of the leads get converted while most do not. The typical lead conversion rate at X Education is around 30%.

Although X Education gets a lot of leads, its lead conversion rate is very poor. For example, if they acquire 100 leads in a day, only about 30 of them are converted. To make this process more efficient, the company wishes to identify the leads with the most potential, also known as ‘Hot Leads’. If they successfully identify this set of leads, the lead conversion rate should go up, as the sales team will focus on communicating with the potential leads rather than making calls to everyone.

There are a lot of leads generated in the initial stage (top) but only a few of them come out as paying customers from the bottom. In the middle stage, you need to nurture the potential leads well (i.e. educating the leads about the product, constantly communicating, etc. ) in order to get a higher lead conversion.

X Education wants to select the most promising leads, i.e. the leads that are most likely to convert into paying customers. The company requires you to build a model that assigns a lead score to each lead, such that customers with a higher lead score have a higher conversion chance and customers with a lower lead score have a lower conversion chance. The CEO, in particular, has given a ballpark target lead conversion rate of around 80%.
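To make the arithmetic behind the 30% baseline and the 80% target concrete, here is a minimal sketch of how a score cutoff could be chosen once a model has produced lead scores. Everything in it is hypothetical: the scores, the outcomes, and the function names are made up for illustration and are not part of the dataset or of any prescribed solution.

```python
def conversion_rate(leads):
    """Share of leads that converted; leads are (score, converted) pairs."""
    if not leads:
        return 0.0
    return sum(converted for _, converted in leads) / len(leads)


def lowest_cutoff_meeting_target(leads, target=0.8):
    """Lowest score cutoff whose 'hot' leads (score >= cutoff) reach the target.

    Returns None when no cutoff reaches the target conversion rate.
    """
    for cutoff in sorted({score for score, _ in leads}):
        hot = [(s, c) for s, c in leads if s >= cutoff]
        if conversion_rate(hot) >= target:
            return cutoff
    return None


# Hypothetical historical leads: (lead score 0-100, converted 1/0).
leads = [
    (95, 1), (90, 1), (85, 0), (80, 1), (70, 0),
    (60, 0), (50, 0), (40, 0), (30, 0), (20, 0),
]

baseline = conversion_rate(leads)              # calling everyone
cutoff = lowest_cutoff_meeting_target(leads)   # calling only hot leads
```

With this toy data, calling every lead converts 3 of 10, matching the 30% baseline in the text, while restricting calls to leads at or above the returned cutoff reaches the 80% target at the cost of contacting far fewer leads. That trade-off between conversion rate and volume is the design choice a real model would have to tune.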

Content

Variables Description
* Prospect ID - A unique ID with which the customer is identified.
* Lead Number - A lead number assigned to each lead procured.
* Lead Origin - The origin identifier with which the customer was identified to be a lead. Includes API, Landing Page Submission, etc.
* Lead Source - The source of the lead. Includes Google, Organic Search, Olark Chat, etc.
* Do Not Email - An indicator variable selected by the customer wherein they select whether or not they want to be emailed about the course.
* Do Not Call - An indicator variable selected by the customer wherein they select whether or not they want to be called about the course.
* Converted - The target variable. Indicates whether a lead has been successfully converted or not.
* TotalVisits - The total number of visits made by the customer on the website.
* Total Time Spent on Website - The total time spent by the customer on the website.
* Page Views Per Visit - Average number of pages on the website viewed during the visits.
* Last Activity - Last activity performed by the customer. Includes Email Opened, Olark Chat Conversation, etc.
* Country - The country of the customer.
* Specialization - The industry domain in which the customer worked before. Includes the level 'Select Specialization', which means the customer had not selected this option while filling in the form.
* How did you hear about X Education - The source from which the customer heard about X Education.
* What is your current occupation - Indicates whether the customer is a student, unemployed or employed.
* What matters most to you in choosing this course - An option selected by the customer indicating their main motive for doing this course.
* Search - Indicates whether the customer had seen the ad in any of the listed items.
* Magazine
* Newspaper Article
* X Education Forums
* Newspaper
* Digital Advertisement
* Through Recommendations - Indicates whether the customer came in through recommendations.
* Receive More Updates About Our Courses - Indicates whether the customer chose to receive more updates about the courses.
* Tags - Tags assigned to customers indicating the current status of the lead.
* Lead Quality - Indicates the quality of the lead based on the data and the intuition of the employee who has been assigned to the lead.
* Update me on Supply Chain Content - Indicates whether the customer wants updates on the Supply Chain Content.
* Get updates on DM Content - Indicates whether the customer wants updates on the DM Content.
* Lead Profile - A lead level assigned to each customer based on their profile.
* City - The city of the customer.
* Asymmetric Activity Index - An index and score assigned to each customer based on their activity and their profile.
* Asymmetric Profile Index
* Asymmetric Activity Score
* Asymmetric Profile Score
* I agree to pay the amount through cheque - Indicates whether the customer has agreed to pay the amount through cheque or not.
* a free copy of Mastering The Interview - Indicates whether the customer wants a free copy of 'Mastering the Interview' or not.
* Last Notable Activity - The last notable activity performed by the student.

Acknowledgements

UpGrad Case Study

Data from: Recognizing the importance of near-home contact with nature for...
data.niaid.nih.gov
datasetcatalog.nlm.nih.gov
+2 more
zip
Updated Aug 29, 2023
Cite
Magdalena Lenda; Piotr Skórka; Małgorzata Jaźwa; Hsien-Yung Lin; Edward Nęcka; Piotr Tryjanowski; Dawid Moroń; Johannes M. H. Knops; Hugh P. Possingham (2023). Recognizing the importance of near-home contact with nature for mental well-being based on the COVID-19 lockdown experience [Dataset]. http://doi.org/10.5061/dryad.fn2z34v1h
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.fn2z34v1h
Dataset updated
Aug 29, 2023
Dataset provided by
Xi’an Jiaotong-Liverpool University
Uniwersytet SWPS
Institute of Nature Conservation
University of Life Sciences in Poznań
Carleton University
Institute of Systematics and Evolution of Animals
The University of Queensland
University of Opole
Authors
Magdalena Lenda; Piotr Skórka; Małgorzata Jaźwa; Hsien-Yung Lin; Edward Nęcka; Piotr Tryjanowski; Dawid Moroń; Johannes M. H. Knops; Hugh P. Possingham
License
https://spdx.org/licenses/CC0-1.0.html
Description
Several urban landscape planning solutions have been introduced around the world to find a balance between developing urban spaces, maintaining and restoring biodiversity, and enhancing quality of human life. Our global mini-review, combined with analysis of big data collected from Google Trends at global scale, reveals the importance of enjoying day-to-day contact with nature and engaging in such activities as nature observation and identification and gardening for the mental well-being of humans during the COVID-19 pandemic. Home-based activities, such as watching birds from one’s window, identifying species of plants and animals, backyard gardening, and collecting information about nature for citizen science projects, were popular during the first lockdown in spring 2020, when people could not easily venture out of their homes. In our mini-review, we found 37 articles from 28 countries with a total sample of 114,466 people. These papers suggest that home-based engagement with nature was an entertaining and pleasant distraction that helped preserve mental well-being during a challenging time. According to Google Trends, interest in such activities increased during lockdown compared to the previous five years. Millions of people worldwide are chronically or temporarily confined to their homes and neighborhoods because of illness, childcare chores, or elderly care responsibility, which makes it difficult for them to travel far to visit such places as national parks, created through land sparing, where people go to enjoy nature and relieve stress. This article posits that for such people, living in an urban landscape designed to facilitate effortless contact with small natural areas is a more effective way to receive the mental health benefits of contact with nature than visiting a sprawling nature park on rare occasions.

Methods

1. Identifying the most common types of activities related to nature observation, gardening, and taxa identification during the first lockdown based on scientific articles and non-scientific press

For scientific articles, in March 2023 we searched Scopus and Google Scholar. For countries where Google is restricted, such as China, similar results will be available from other scientific browsers, with the highest number of results from our database being available from Scopus. We used the Google Search browser to search for globally published non-scientific press articles. Some selection criteria were applied during article review. Specifically, we excluded articles that were not about the first lockdown; did not study activities at a local scale (from balcony, window, backyard) but rather in areas far away from home (e.g., visiting forests); studied the mental health effect of observing indoor potted plants and pet animals; or transiently mentioned the topic or keyword without going into any scientific detail. We included all papers that met our criteria, that is, studies that analyzed our chosen topic with experiments or planned observations. We included all research papers, but not letters that made claims without any data. Google Scholar automatically screened the title, abstract, keywords, and the whole text of each article for the keywords we entered. All articles that met our criteria were read and double-checked for keywords and content related to the keywords (e.g., synonyms or if they presented content about the relevant topic without using the specific keywords). We identified, from both types of articles, the major nature-based activities that people engaged in during the first lockdown in the spring of 2020.

Keywords used in this study were grouped into six main topics: (1) COVID-19 pandemic; (2) nature-oriented activity focused on nature observation, identification of different taxa, or gardening; (3) mental well-being; (4) activities performed from a balcony, window, or in gardens; (5) entertainment; and (6) citizen science (see Table 1 for all keywords).

2. Increase in global trends in interest in nature observation, gardening, and taxa identification during the first lockdown

We used the categorical cluster method, which was combined with big data from Google Trends (downloaded on 1 September 2020) and anomaly detection to identify trend anomalies globally in people’s interests. We used this combination of methods to examine whether interest in nature-based activities that were mentioned in scientific and nonscientific press articles increased during the first lockdown. Keywords linked with the main types of nature-oriented activities, as identified from press and scientific articles, and used according to the categorical clustering method were classified into the following six main categories: (1) global interest in bird-watching and bird identification combined with citizen science; (2) global interest in plant identification and gardening combined with citizen science; (3) global interest in butterfly watching; (4) local interest in early-spring (lockdown time), summer, or autumn flowering species that usually can be found in Central European (country: Poland) backyards; (5) global interest in traveling and social activities; and (6) global interest in nature areas and activities typically enjoyed during holidays and thus requiring traveling to land-spared nature reserves.

The six categories were divided into 15 subcategories so that we could attach relevant words or phrases belonging to the same cluster and typically related to the activity (according to Google Trends and Google browser’s automatic suggestions; e.g., people who searched for “bird-watching” typically also searched for “binoculars,” “bird feeder,” “bird nest,” and “birdhouse”). The subcategories and keywords used for data collection about trends in society’s interest in the studied topic from Google Trends are as follows.

Bird-watching: “binoculars,” “bird feeder,” “bird nest,” “birdhouse,” “bird-watching”
Bird identification: “bird app,” “bird identification,” “bird identification app,” “bird identifier,” “bird song app”
Bird-watching combined with citizen science: “bird guide,” “bird identification,” “eBird,” “feeding birds,” “iNaturalist”
Citizen science and bird-watching apps: “BirdNET,” “BirdSong ID,” “eBird,” “iNaturalist,” “Merlin Bird ID”
Gardening: “gardening,” “planting,” “seedling,” “seeds,” “soil”
Shopping for gardening: “garden shop,” “plant buy,” “plant ebay,” “plant sell,” “plant shop”
Plant identification apps: “FlowerChecker,” “LeafSnap,” “NatureGate,” “Plantifier,” “PlantSnap”
Citizen science and plant identification: “iNaturalist,” “plant app,” “plant check,” “plant identification app,” “plant identifier”
Flowers that were flowering in gardens during lockdown in Poland: “fiołek” (viola), “koniczyna” (shamrock), “mlecz” (dandelion), “pierwiosnek” (primrose), “stokrotka” (daisy). They are typical early-spring flowers growing in gardens in Central Europe. We had to be more specific in this search because there are no plant species blooming across the world at the same time. These plant species have well-known biology; thus, we could easily interpret these results.
Flowers that were not flowering during lockdown in Poland: “chaber” (cornflower), “mak” (poppy), “nawłoć” (goldenrod), “róża” (rose), “rumianek” (chamomile). They are typical mid-summer flowering plants often planted in gardens.
Interest in traveling long distances and in social activities that involve many people: “airport,” “bus,” “café,” “driving,” “pub”
Single or mass commuting, and traveling: “bike,” “boat,” “car,” “flight,” “train”
Interest in distant places and activities for visiting natural areas: “forest,” “nature park,” “safari,” “trekking,” “trip”
Places and activities for holidays (typically located far away): “coral reef,” “rainforest,” “safari,” “savanna,” “snorkeling”
Butterfly watching: “butterfly watching,” “butterfly identification,” “butterfly app,” “butterfly net,” “butterfly guide”

In Google Trends, we set the following filters: global search, dates: July 2016–July 2020; language: English.
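The methods text mentions anomaly detection on Google Trends series but does not say which algorithm was used. Purely as an illustration of the general idea, the sketch below flags points in a lockdown period that sit far above a pre-lockdown baseline using a simple z-score rule. The z-score rule, the threshold, the function name, and the numbers are all assumptions and are not the authors' method or data.

```python
from statistics import mean, stdev


def flag_anomalies(values, baseline_len, z_threshold=3.0):
    """Return indices after the baseline whose z-score exceeds the threshold.

    values:       interest-over-time series (e.g. a 0-100 search index)
    baseline_len: number of leading points treated as the normal period
    Assumption: only upward deviations (increased interest) are flagged.
    """
    baseline = values[:baseline_len]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return []  # a flat baseline gives no scale to measure deviations
    return [
        i
        for i in range(baseline_len, len(values))
        if (values[i] - mu) / sigma > z_threshold
    ]


# Hypothetical weekly interest for one keyword: eight baseline weeks,
# then three later weeks, two of which show a surge in interest.
interest = [40, 42, 38, 41, 39, 40, 43, 41, 80, 95, 44]
anomalous_weeks = flag_anomalies(interest, baseline_len=8)
```

In a study like the one described, the same check would be run per keyword and the results then compared across the keyword clusters, so that a surge in one search term can be read against the other terms in its subcategory.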
Independent Data Aggregation, Quality Control and Visualization of...
datasetcatalog.nlm.nih.gov
arizona.figshare.com
Updated Oct 21, 2020
Cite
Ly, Chun; Knott, Cheryl; McCleary, Jill; Castiello-Gutiérrez, Santiago (2020). Independent Data Aggregation, Quality Control and Visualization of University of Arizona COVID-19 Re-Entry Testing Data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000484783
Explore at:
Dataset updated
Oct 21, 2020
Authors
Ly, Chun; Knott, Cheryl; McCleary, Jill; Castiello-Gutiérrez, Santiago
Description
Abstract
The dataset provided here contains the results of independent data aggregation, quality control, and visualization of the University of Arizona (UofA) COVID-19 testing programs for the 2019 novel Coronavirus pandemic. The dataset is provided in the form of machine-readable tables in comma-separated value (.csv) and Microsoft Excel (.xlsx) formats.

Additional Information
As part of the UofA response to the 2019-20 Coronavirus pandemic, testing was conducted on students, staff, and faculty prior to the start of the academic year and throughout the school year. Testing was done at the UofA Campus Health Center and through their instance program called "Test All Test Smart" (TATS). These tests identify active cases of SARS-CoV-2 infection using the reverse transcription polymerase chain reaction (RT-PCR) test and the Antigen test. Because the Antigen test provided more rapid diagnosis, it was used heavily in the three weeks prior to the start of the Fall semester and throughout the academic year.

As these tests were occurring, results were provided on the COVID-19 websites. First, beginning in early March, the Campus Health Alerts website reported the total number of positive cases. Later, numbers were provided for the total number of tests (March 12 and thereafter). According to the website, these numbers were updated daily for positive cases and weekly for total tests. These numbers were reported until early September, when they were included in the reporting for the TATS program.

For the TATS program, numbers were provided through the UofA COVID-19 Update website. Initially, on August 21, the numbers provided were the total number (July 31 and thereafter) of tests and positive cases. Later (August 25), additional information was provided where both PCR and Antigen testing were available. Here, the daily numbers were also included. On September 3, this website began providing both the Campus Health and TATS data. Here, PCR and Antigen were combined and referred to as "Total", and daily and cumulative numbers were provided.

No official data dashboard was available until September 16, and aside from the information provided on these websites, the full dataset was not made publicly available. As such, the authors of this dataset independently aggregated data from multiple sources. These data were made publicly available through a Google Sheet, with graphical illustration provided through the spreadsheet and on social media. The goal of providing the data and illustrations publicly was to provide factual information and to understand the infection rate of SARS-CoV-2 in the UofA community.

Because of differences in reported data between Campus Health and the TATS program, the dataset provides Campus Health numbers on September 3 and thereafter. TATS numbers are provided beginning on August 14, 2020.

Description of Dataset Content
The following terms are used in describing the dataset.
1. "Report Date" is the date and time at which the website was updated to reflect the new numbers
2. "Test Date" is the date of testing/sample collection
3. "Total" is the combination of Campus Health and TATS numbers
4. "Daily" is the new data associated with the Test Date
5. "To Date (07/31--)" provides the cumulative numbers from 07/31 and thereafter
6. "Sources" provides the source of information. The number prior to the colon refers to the number of sources. Here, "UACU" refers to the UA COVID-19 Update page, and "UARB" refers to the UA Weekly Re-Entry Briefing. "SS" and "WBM" refer to screenshot (manually acquired) and "Wayback Machine" (see Reference section for links), with initials provided to indicate which author recorded the values. These screenshots are available in the records.zip file.

The dataset is distinguished, where available, by the testing program and the methods of testing. Where data are not available, calculations are made to fill in missing data (e.g., extrapolating backwards on the total number of tests based on daily numbers that are deemed reliable). Where errors are found (by comparing to previous numbers), those are reported on the above Google Sheet with specifics noted.

For inquiries regarding the contents of this dataset, please contact the Corresponding Author listed in the README.txt file. Administrative inquiries (e.g., removal requests, trouble downloading, etc.) can be directed to data-management@arizona.edu
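The backward extrapolation mentioned in the description (recovering earlier cumulative totals from a later total and reliable daily counts) can be sketched as below. This is a minimal illustration, not the authors' procedure: the function name and the numbers are hypothetical, and it assumes each daily count is the number of tests added on that date.

```python
def backfill_cumulative(cumulative, daily):
    """Fill missing (None) cumulative totals by working backwards.

    Uses cumulative[t-1] = cumulative[t] - daily[t], which holds when
    daily[t] is the number of new tests recorded on day t.
    """
    filled = list(cumulative)
    for t in range(len(filled) - 1, 0, -1):
        if (
            filled[t - 1] is None
            and filled[t] is not None
            and daily[t] is not None
        ):
            filled[t - 1] = filled[t] - daily[t]
    return filled


# Hypothetical values: totals are only known for the last two days.
cumulative = [None, None, 1500, 1800]
daily = [200, 300, 400, 300]

filled = backfill_cumulative(cumulative, daily)
```

The loop stops filling as soon as a daily count is missing, so gaps that cannot be reconstructed from reliable numbers stay as `None` rather than being guessed.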
Mobile internet users worldwide 2020-2029
statista.com
Updated Feb 5, 2025
Cite
Statista Research Department (2025). Mobile internet users worldwide 2020-2029 [Dataset]. https://www.statista.com/topics/779/mobile-internet/
Dataset updated
Feb 5, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Description
The global number of smartphone users was forecast to increase continuously between 2024 and 2029 by a total of 1.8 billion users (+42.62 percent). After the ninth consecutive year of growth, the smartphone user base is estimated to reach 6.1 billion users in 2029, a new peak. Notably, the number of smartphone users has increased continuously over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The figures shown have been derived from survey data that has been processed to estimate missing demographics. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of smartphone users in regions such as Australia & Oceania and Asia.
Global overview of cloud-, snow-, and shade-free Landsat (1982-2024) and...
data.niaid.nih.gov
search.dataone.org
zip
Updated Apr 11, 2025
Katarzyna Ewa Lewińska; Stefan Ernst; David Frantz; Ulf Leser; Patrick Hostert (2025). Global overview of cloud-, snow-, and shade-free Landsat (1982-2024) and Sentinel-2 (2015-2024) data [Dataset]. http://doi.org/10.5061/dryad.gb5mkkwxm
Available download formats
zip
Unique identifier
https://doi.org/10.5061/dryad.gb5mkkwxm
Dataset updated
Apr 11, 2025
Dataset provided by
Humboldt-Universität zu Berlin
Trier University of Applied Sciences
Authors
Katarzyna Ewa Lewińska; Stefan Ernst; David Frantz; Ulf Leser; Patrick Hostert
License
https://spdx.org/licenses/CC0-1.0.html
Description
Landsat and Sentinel-2 acquisitions are among the most frequently used medium-resolution (i.e., 10-30 m) optical data. The data are extensively used in terrestrial vegetation applications, including but not limited to land cover and land use mapping, vegetation condition and phenology monitoring, and disturbance and change mapping. While the Landsat archives alone provide over 40 years, and counting, of continuous and consistent observations, since mid-2015 Sentinel-2 has enabled a revisit frequency of up to 2 days. Although the spatio-temporal availability of both data archives is well known at the scene level, information on the actual availability of usable (i.e., cloud-, snow-, and shade-free) observations at the pixel level needs to be explored for each study to ensure correct parametrization of the algorithms used, and thus the robustness of subsequent analyses. However, a priori data exploration is time- and resource-consuming and is thus rarely performed. As a result, the spatio-temporal heterogeneity of usable data is often inadequately accounted for in the analysis design, risking ill-advised selection of algorithms and hypotheses, and thus inferior quality of final results. Here we present a global dataset comprising precomputed daily availability of usable Landsat and Sentinel-2 data sampled at the pixel level in a regular 0.18°-point grid. We based the dataset on the complete 1982-2024 Landsat surface reflectance data (Collection 2) and 2015-2024 Sentinel-2 top-of-the-atmosphere reflectance scenes (pre-Collection-1 and Collection-1). Derivation of cloud-, snow-, and shade-free observations followed the methodology developed in our recent study on data availability over Europe (Lewińska et al., 2023; https://doi.org/10.20944/preprints202308.2174.v2). Furthermore, we expanded the dataset with growing season information derived from the 2001-2019 time series of the yearly 500 m MODIS land cover dynamics product (MCD12Q2; Collection 6).
As such, our dataset presents a unique overview of the spatio-temporal availability of usable daily Landsat and Sentinel-2 data at the global scale, hence offering much-needed a priori information aiding the identification of appropriate methods and challenges for terrestrial vegetation analyses at the local to global scales. The dataset can be viewed using the dedicated GEE App (link in Related Works). As of February 2025 the dataset has been extended with the 2024 data.

Methods

We based our analyses on freely and openly accessible Landsat and Sentinel-2 data archives available in Google Earth Engine (Gorelick et al., 2017). We used all Landsat surface reflectance Level 2, Tier 1, Collection 2 scenes acquired with the Thematic Mapper (TM) (Earth Resources Observation And Science (EROS) Center, 1982), Enhanced Thematic Mapper (ETM+) (Earth Resources Observation And Science (EROS) Center, 1999), and Operational Land Imager (OLI) (Earth Resources Observation And Science (EROS) Center, 2013) scanners between 22nd August 1982 and 31st December 2024, and Sentinel-2 TOA reflectance Level-1C scenes (pre-Collection-1 (European Space Agency, 2015, 2021) and Collection-1 (European Space Agency, 2022)) acquired with the MultiSpectral Instrument (MSI) between 23rd June 2015 and 31st December 2024.

We implemented a conservative pixel-quality screening to identify cloud-, snow-, and shade-free land pixels. For the Landsat time series, we relied on the inherent pixel quality bands (Foga et al., 2017; Zhu & Woodcock, 2012), excluding all pixels flagged as cloud, snow, or shadow as well as pixels with the fill-in value of 20,000 (scale factor 0.0001; Zhang et al., 2022). Furthermore, due to the Landsat 7 orbit drift (Qiu et al., 2021) we excluded all ETM+ scenes acquired after 31st December 2020.
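The Landsat screening described above can be sketched with NumPy as follows. This is an illustration only, not the authors' Google Earth Engine implementation: the bit positions are an assumption based on the Collection 2 `QA_PIXEL` band layout (bit 3 cloud, bit 4 cloud shadow, bit 5 snow), and the function name `usable_landsat` and the sample values are hypothetical.

```python
import numpy as np

# Assumed QA_PIXEL bit positions (Landsat Collection 2 layout).
CLOUD_BIT = 3
SHADOW_BIT = 4
SNOW_BIT = 5
FILL_VALUE = 20000  # fill-in value named in the text


def usable_landsat(qa_pixel: np.ndarray, reflectance: np.ndarray) -> np.ndarray:
    """Return True where a pixel is not cloud, shadow, snow, or fill."""
    flagged = np.zeros(qa_pixel.shape, dtype=bool)
    for bit in (CLOUD_BIT, SHADOW_BIT, SNOW_BIT):
        # Test whether the given bit is set in the quality band.
        flagged |= ((qa_pixel >> bit) & 1).astype(bool)
    return ~flagged & (reflectance != FILL_VALUE)


# Hypothetical pixels: clear, cloud, shadow, snow, clear-but-fill.
qa = np.array(
    [0b01000000, 0b00001000, 0b00010000, 0b00100000, 0b01000000],
    dtype=np.uint16,
)
refl = np.array([1200, 1500, 900, 8000, 20000])
print(usable_landsat(qa, refl).tolist())  # [True, False, False, False, False]
```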
Because Sentinel-2 Level-2A quality masks lack the desired scope and accuracy (Baetens et al., 2019; Coluzzi et al., 2018), we resorted to Level-1C scenes accompanied by the supporting Cloud Probability product. Furthermore, we employed a selection of conditions, including a threshold on Band 10 (SWIR-Cirrus), which is not available at Level‑2A. Overall, our Sentinel-2-specific cloud, shadow, and snow screening comprised:

1. Exclusion of all pixels flagged as clouds and cirrus in the inherent 'QA60' cloud mask band.
2. Exclusion of all pixels with cloud probability >50% as defined in the corresponding Cloud Probability product available for each scene.
3. Exclusion of cirrus clouds (B10 reflectance >0.01).
4. Exclusion of clouds based on Cloud Displacement Analysis (CDI < -0.5) (Frantz et al., 2018).
5. Exclusion of dark pixels (B8 reflectance <0.16) within cloud shadows modelled for each scene with scene-specific sun parameters for the clouds identified in the previous steps. Here we assumed a cloud height of 2,000 m.
6. Exclusion of pixels within a 40-m buffer (two pixels at 20-m resolution) around each identified cloud and cloud shadow object.
7. Exclusion of snow pixels identified with a snow mask branch of the Sen2Cor processor (Main-Knorn et al., 2017).
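The per-pixel thresholds listed above can be combined as in the following minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `usable_sentinel2` and all input arrays are hypothetical, the modelled shadow area and the snow mask are taken as precomputed boolean inputs, and the 40-m buffer step is omitted because it needs a spatial operation rather than a per-pixel test.

```python
import numpy as np


def usable_sentinel2(
    qa60_flagged: np.ndarray,        # bool: cloud or cirrus in the QA60 band
    cloud_prob: np.ndarray,          # percent, from the Cloud Probability product
    b10: np.ndarray,                 # Band 10 (SWIR-Cirrus) reflectance
    cdi: np.ndarray,                 # CDI value per pixel
    b8: np.ndarray,                  # Band 8 reflectance
    in_modelled_shadow: np.ndarray,  # bool: inside a modelled cloud shadow
    snow: np.ndarray,                # bool: snow mask
) -> np.ndarray:
    """Return True where a pixel passes all per-pixel screening rules."""
    cloud = qa60_flagged | (cloud_prob > 50) | (b10 > 0.01) | (cdi < -0.5)
    # Only dark pixels inside a modelled shadow area count as shadow.
    shadow = in_modelled_shadow & (b8 < 0.16)
    return ~(cloud | shadow | snow)


# Hypothetical pixels: clear, QA60 cloud, high cloud probability,
# cirrus, dark pixel in shadow, low CDI.
usable = usable_sentinel2(
    qa60_flagged=np.array([False, True, False, False, False, False]),
    cloud_prob=np.array([10, 5, 80, 10, 10, 10]),
    b10=np.array([0.001, 0.001, 0.001, 0.02, 0.001, 0.001]),
    cdi=np.array([0.0, 0.0, 0.0, 0.0, 0.0, -0.7]),
    b8=np.array([0.3, 0.3, 0.3, 0.3, 0.1, 0.3]),
    in_modelled_shadow=np.array([False, False, False, False, True, False]),
    snow=np.zeros(6, dtype=bool),
)
print(usable.tolist())  # [True, False, False, False, False, False]
```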

Through applying the data screening, we generated a collection of daily availability records for Landsat and Sentinel-2 data archives. We next subsampled the resulting binary time series with a regular 0.18° x 0.18°-point grid defined in the EPSG:4326 projection, obtaining 475,150 points located over land between -179.8867°W and 179.5733°E and 83.50834°N and -59.05167°S. Owing to the substantial amount of data comprised in the Landsat and Sentinel-2 archives and the computationally demanding process of cloud-, snow-, and shade-screening, we performed the subsampling in batches corresponding to a 4° x 4° regular grid and consolidated the final data in post-processing.

We derived the pixel-specific growing season information from the 2001-2019 time series of the yearly 500-m MODIS land cover dynamics product (MCD12Q2; Collection 6) available in Google Earth Engine. We only used information on the start and the end of a growing season, excluding all pixels with quality below 'best'. When a pixel went through more than one growing cycle per year, we approximated a growing season as the period between the beginning of the first growing cycle and the end of the last growing cycle. To fill in data gaps arising from low-quality data and insufficiently pronounced seasonality (Friedl et al., 2019), we used a 5x5 mean moving window filter to ensure better spatial continuity of our growing season datasets. Following Lewińska et al. (2023), we defined the start of the season as the pixel-specific 25th percentile of the 2001-2019 distribution for the start of the season dates, and the end of the season as the pixel-specific 75th percentile of the 2001-2019 distribution for end of the season dates. Finally, we subsampled the start and end of the season datasets with the same regular 0.18° x 0.18°-point grid defined in the EPSG:4326 projection.

References:

Baetens, L., Desjardins, C., & Hagolle, O. (2019). Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. Remote Sensing, 11(4), 433. https://doi.org/10.3390/rs11040433
Coluzzi, R., Imbrenda, V., Lanfredi, M., & Simoniello, T. (2018). A first assessment of the Sentinel-2 Level 1-C cloud mask product to support informed surface analyses. Remote Sensing of Environment, 217, 426–443. https://doi.org/10.1016/j.rse.2018.08.009
Earth Resources Observation And Science (EROS) Center. (1982). Collection-2 Landsat 4-5 Thematic Mapper (TM) Level-1 Data Products [Other]. U.S. Geological Survey. https://doi.org/10.5066/P918ROHC
Earth Resources Observation And Science (EROS) Center. (1999). Collection-2 Landsat 7 Enhanced Thematic Mapper Plus (ETM+) Level-1 Data Products [dataset]. U.S. Geological Survey. https://doi.org/10.5066/P9TU80IG
Earth Resources Observation And Science (EROS) Center. (2013). Collection-2 Landsat 8-9 OLI (Operational Land Imager) and TIRS (Thermal Infrared Sensor) Level-1 Data Products [Other]. U.S. Geological Survey. https://doi.org/10.5066/P975CC9B
European Space Agency. (2015). Sentinel-2 MSI Level-1C TOA Reflectance [dataset]. European Space Agency. https://doi.org/10.5270/S2_-d8we2fl
European Space Agency. (2021). Sentinel-2 MSI Level-1C TOA Reflectance, Collection 0 [dataset]. European Space Agency. https://doi.org/10.5270/S2_-d8we2fl
European Space Agency. (2022). Sentinel-2 MSI Level-1C TOA Reflectance [dataset]. European Space Agency. https://doi.org/10.5270/S2_-742ikth
Foga, S., Scaramuzza, P. L., Guo, S., Zhu, Z., Dilley, R. D., Beckmann, T., Schmidt, G. L., Dwyer, J. L., Joseph Hughes, M., & Laue, B. (2017). Cloud detection algorithm comparison and validation for operational Landsat data products. Remote Sensing of Environment, 194, 379–390. https://doi.org/10.1016/j.rse.2017.03.026
Frantz, D., Haß, E., Uhl, A., Stoffels, J., & Hill, J. (2018). Improvement of the Fmask algorithm for Sentinel-2 images: Separating clouds from bright surfaces based on parallax effects. Remote Sensing of Environment, 215, 471–481. https://doi.org/10.1016/j.rse.2018.04.046
Friedl, M., Josh, G., & Sulla-Menashe, D. (2019). MCD12Q2 MODIS/Terra+Aqua Land Cover Dynamics Yearly L3 Global 500m SIN Grid V006 [dataset]. NASA EOSDIS Land Processes DAAC. https://doi.org/10.5067/MODIS/MCD12Q2.006
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18–27. https://doi.org/10.1016/j.rse.2017.06.031
Lewińska, K. E., Ernst, S., Frantz, D., Leser, U., & Hostert, P. (2024). Global Overview of Usable Landsat and Sentinel-2 Data for 1982–2023. Data in Brief, 57. https://doi.org/10.1016/j.dib.2024.111054
Main-Knorn, M., Pflug, B., Louis, J., Debaecker, V., Müller-Wilm, U., & Gascon, F. (2017). Sen2Cor for Sentinel-2. In L. Bruzzone, F. Bovolo,
Monthly and daily datasets.
plos.figshare.com
xlsx
Updated Jan 5, 2024
Maximiliano Lizana; Charisma Choudhury; David Watling (2024). Monthly and daily datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0296686.s003
Available download formats
xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0296686.s003
Dataset updated
Jan 5, 2024
Dataset provided by
PLOS ONE
Authors
Maximiliano Lizana; Charisma Choudhury; David Watling
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Aggregated mobility indices (AMIs) derived from information and communications technologies have recently emerged as a new data source for transport planners, with particular value during periods of major disturbances or when other sources of mobility data are scarce. Particularly, indices estimated on the aggregate user concentration in public transport (PT) hubs based on GPS of smartphones, or the number of PT navigation queries in smartphone applications have been used as proxies for the temporal changes in PT aggregate demand levels. Despite the popularity of these indices, it remains largely untested whether they can provide a reasonable characterisation of actual PT ridership changes. This study aims to address this research gap by investigating the reliability of using AMIs for inferring PT ridership changes by offering the first rigorous benchmarking between them and ridership data derived from smart card validations and tickets. For the comparison, we use monthly and daily ridership data from 12 cities worldwide and two AMIs shared globally by Google and Apple during periods of major change in 2020–22. We also explore the complementary role of AMIs on traditional ridership data. The comparative analysis revealed that the index based on human mobility (Google) exhibited a notable alignment with the trends reported by ridership data and performed better than the one based on PT queries (Apple). Our results differ from previous studies by showing that AMIs performed considerably better for similar periods. This finding highlights the huge relevance of dealing with methodological differences in datasets before comparing. Moreover, we demonstrated that AMIs can also complement data from smart card records when ticketing is missing or of doubtful quality. The outcomes of this study are particularly relevant for cities of developing countries, which usually have limited data to analyse their PT ridership, and AMIs may offer an attractive alternative.
Mobile internet usage reach in North America 2020-2029
statista.com
Updated Feb 5, 2025
Statista Research Department (2025). Mobile internet usage reach in North America 2020-2029 [Dataset]. https://www.statista.com/topics/779/mobile-internet/
Dataset updated
Feb 5, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Description
The population share with mobile internet access in North America was forecast to increase between 2024 and 2029 by a total of 2.9 percentage points. This overall increase is not continuous, notably not in 2028 and 2029. Mobile internet penetration is estimated to reach 84.21 percent in 2029. Notably, the population share with mobile internet access has increased continuously over the past years. The penetration rate refers to the share of the total population having access to the internet via a mobile broadband connection. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the population share with mobile internet access in regions such as the Caribbean and Europe.