33 datasets found
  1. Table1_Reliability of Google Trends: Analysis of the Limits and Potential of...

    • frontiersin.figshare.com
    docx
    Updated May 30, 2023
    Cite
    Alessandro Rovetta (2023). Table1_Reliability of Google Trends: Analysis of the Limits and Potential of Web Infoveillance During COVID-19 Pandemic and for Future Research.DOCX [Dataset]. http://doi.org/10.3389/frma.2021.670226.s001
    Explore at:
    docx (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Alessandro Rovetta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Alongside the COVID-19 pandemic, government authorities around the world have had to face a growing infodemic capable of causing serious damage to public health and the economy. In this context, the use of infoveillance tools has become a primary necessity. Objective: The aim of this study is to test the reliability of a widely used infoveillance tool, Google Trends. In particular, the paper focuses on the analysis of relative search volumes (RSVs), quantifying their dependence on the day they are collected. Methods: RSVs of the query coronavirus + covid during February 1–December 4, 2020 (period 1), and February 20–May 18, 2020 (period 2), were collected daily from Google Trends between December 8 and 27, 2020. The survey covered Italian regions and cities, and countries and cities worldwide. The search category was set to all categories. Each dataset was analyzed to observe any dependence of the RSVs on the day they were gathered. To do this, calling i the country, region, or city under investigation and j the day its RSV was collected, a Gaussian distribution X_i = X(σ_i, x̄_i) was used to represent the trend of daily variations of x_ij = RSV_ij. When a missing value was found (anomaly), the affected country, region, or city was excluded from the analysis. When anomalies exceeded 20% of the sample size, the whole sample was excluded from the statistical analysis. Pearson and Spearman correlations between RSVs and the number of COVID-19 cases were calculated day by day so as to highlight any variations related to the day the RSVs were collected. Welch's t-test was used to assess the statistical significance of the differences between the average RSVs of the various countries, regions, or cities of a given dataset. Two RSVs were considered statistically confident when t
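
    To make the day-by-day comparison concrete, the sketch below reproduces the general idea under stated assumptions: daily RSV collections stored as one row per location i and collection day j, Pearson and Spearman correlations against case counts computed per collection day, and Welch's t-test between the RSV series of two locations. The file layout, the column names (location, collect_day, rsv, cases), and the example region names are hypothetical, not taken from the supplementary table.

      # Minimal sketch; data layout and column names are assumptions.
      import pandas as pd
      from scipy import stats

      df = pd.read_csv("rsv_collections.csv")  # hypothetical file

      # Per-collection-day correlation between RSVs and COVID-19 case counts.
      for day, grp in df.groupby("collect_day"):
          pearson_r, _ = stats.pearsonr(grp["rsv"], grp["cases"])
          spearman_r, _ = stats.spearmanr(grp["rsv"], grp["cases"])
          print(day, round(pearson_r, 3), round(spearman_r, 3))

      # Welch's t-test (unequal variances) between two locations' daily RSV values.
      a = df.loc[df["location"] == "Lombardia", "rsv"]  # illustrative region names
      b = df.loc[df["location"] == "Lazio", "rsv"]
      t, p = stats.ttest_ind(a, b, equal_var=False)
      print(t, p)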

  2. Google energy consumption 2011-2023

    • statista.com
    Updated Oct 11, 2024
    Cite
    Statista, Google energy consumption 2011-2023 [Dataset]. https://www.statista.com/statistics/788540/energy-consumption-of-google/
    Explore at:
    Dataset updated
    Oct 11, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    Google's energy consumption has increased over the last few years, reaching 25.9 terawatt hours in 2023, up from 12.8 terawatt hours in 2019. The company has made efforts to make its data centers more efficient through customized high-performance servers, smart temperature and lighting controls, advanced cooling techniques, and machine learning.

    Data centers and energy: Through its operations, Google pursues a more sustainable impact on the environment by building efficient data centers that use less energy than average, transitioning toward renewable energy, creating sustainable workplaces, and providing its users with technological means toward a cleaner future for coming generations. Through its efficient data centers, Google has also managed to divert waste from its operations away from landfills.

    Reducing Google's carbon footprint: Google's clean energy efforts are also related to its efforts to reduce its carbon footprint. Since committing to using 100 percent renewable energy, the company has met its targets largely through solar and wind power purchase agreements and by buying renewable power from utilities. Google is one of the largest corporate purchasers of renewable energy in the world.

  3. Kuwait Google Search Trends: Online Shopping: Macy's

    • ceicdata.com
    Updated Dec 15, 2024
    Cite
    CEICdata.com (2024). Kuwait Google Search Trends: Online Shopping: Macy's [Dataset]. https://www.ceicdata.com/en/kuwait/google-search-trends-by-categories/google-search-trends-online-shopping-macys
    Explore at:
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 20, 2022 - Oct 1, 2022
    Area covered
    Kuwait
    Description

    Kuwait Google Search Trends: Online Shopping: Macy's data was reported at 0.000 Score on 11 May 2024. This stayed constant from the previous figure of 0.000 Score for 10 May 2024. Kuwait Google Search Trends: Online Shopping: Macy's data is updated daily, averaging 0.000 Score from Dec 2021 (Median) to 11 May 2024, with 893 observations. The data reached an all-time high of 25.000 Score on 09 Jul 2022 and a record low of 0.000 Score on 12 May 2024. Kuwait Google Search Trends: Online Shopping: Macy's data remains at active status in CEIC and is reported by Google Trends. The data is categorized under Global Database's Kuwait – Table KW.Google.GT: Google Search Trends: by Categories.

  4. Global market share of leading desktop search engines 2015-2025

    • statista.com
    • abripper.com
    Updated Apr 28, 2025
    Cite
    Statista (2025). Global market share of leading desktop search engines 2015-2025 [Dataset]. https://www.statista.com/statistics/216573/worldwide-market-share-of-search-engines/
    Explore at:
    Dataset updated
    Apr 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2015 - Mar 2025
    Area covered
    Worldwide
    Description

    As of March 2025, Google represented 79.1 percent of the global online search engine market on desktop devices. Despite remaining far ahead of its competitors, this is the lowest share the search engine has recorded on desktop in over two decades. Meanwhile, its long-time competitor Bing accounted for 12.21 percent, while tools like Yahoo and Yandex held shares of over 2.9 percent each.

    Google and the global search market: Ever since the introduction of Google Search in 1997, the company has dominated the search engine market, while the shares of all other tools have remained comparatively small. The majority of Google's revenues are generated through advertising. Its parent corporation, Alphabet, was one of the biggest internet companies worldwide as of 2024, with a market capitalization of 2.02 trillion U.S. dollars. The company has also expanded its services to mail, productivity tools, enterprise products, mobile devices, and other ventures. As a result, Google earned one of the highest tech company revenues in 2024, with roughly 348.16 billion U.S. dollars.

    Search engine usage in different countries: Google is the most frequently used search engine worldwide, but in some countries its alternatives lead or compete with it to some extent. As of the last quarter of 2023, more than 63 percent of internet users in Russia used Yandex, whereas Google users represented a little over 33 percent. Meanwhile, Baidu was the most used search engine in China, despite a strong decrease in the percentage of internet users in the country accessing it. In other countries, like Japan and Mexico, people tend to use Yahoo along with Google. By the end of 2024, nearly half of the respondents in Japan said that they had used Yahoo in the past four weeks. In the same year, over 21 percent of users in Mexico said they used Yahoo.

  5. Quick Local SEO Wins: 2025 30-Day Implementation Plan

    • caseysseo.com
    html
    Updated Jul 25, 2025
    Cite
    Casey Miller (2025). Quick Local SEO Wins: 2025 30-Day Implementation Plan [Dataset]. https://caseysseo.com/quick-local-seo-wins-2025-30-day-implementation-plan
    Explore at:
    html (available download formats)
    Dataset updated
    Jul 25, 2025
    Dataset provided by
    Casey's SEO
    Authors
    Casey Miller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    Colorado Springs
    Variables measured
    Local Search Usage, Military Population, Local Search Queries, Mobile Search Growth, Local Search Conversion, Generative AI in Search (SGE) Adoption
    Measurement technique
    Google Webmaster Guidelines, Local SEO expert interviews, Industry research and data analysis, Customer surveys
    Description

    This dataset provides a comprehensive 30-day implementation plan with actionable strategies and techniques to help local businesses boost their visibility in local search results and Google Maps. The plan covers critical areas such as optimizing Google Business Profiles, improving mobile user experience, generating more online reviews, and leveraging emerging AI technologies for local search in 2025.

  6. Data (i.e., evidence) about evidence based medicine

    • figshare.com
    • search.datacite.org
    png
    Updated May 30, 2023
    Cite
    Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
    Explore at:
    png (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jorge H Ramirez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Update — December 7, 2014. – Evidence-based medicine (EBM) is not working for many reasons, for example: 1. Incorrect in their foundations (paradox): hierarchical levels of evidence are supported by opinions (i.e., lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534 2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level, politicians and political parties are on their payroll, medical professionals seduced by different types of gifts in exchange of prescriptions (i.e., bribery) which very likely results in patients not receiving the proper treatment for their disease, many times there is no such thing: healthy persons not needing pharmacological treatments of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted in K.O.L. which is only a puppet appearing on stage to spread lies to their peers, a person supposedly trained to improve the well-being of others, now deceits on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. Interpretation of EBM in this context was not anticipated by their creators. “The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” ―Peter C. Gøtzsche “doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” —Peter C Gøtzsche, The BMJ, 2012, Big pharma often commits corporate crime, and this must be stopped. Pending (Colombia): Health Promoter Entities (In Spanish: EPS ―Empresas Promotoras de Salud).

    1. Misinterpretations. New technologies or concepts are difficult to understand in the beginning, no matter how simple they are; we need to get used to new tools aimed at improving our professional practice. Probably the best explanation is in these videos (credits to Antonio Villafaina for sharing them with me). English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315 Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315 ----------------------- Hypothesis: hierarchical levels of evidence based medicine are wrong. Dear Editor, I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2) Additional explanation to this question: – Only respond to this question attaching publicly available raw data. – Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available. References
    2. Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873
    3. The EBM Challenge Day 1: No Answers. Competing interests: I endorse the principles of open data in human biomedical research. Read this letter on The BMJ – August 13, 2014: http://www.bmj.com/content/348/bmj.g3725/rr/762595 (Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.) Fileset contents. Raw data: Excel archive with raw data, interactive figures, and PubMed search terms. A Google Spreadsheet is also available (URL below the article description). Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014). Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014). Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods. Figure 4. PubMed publication trends (1962-2013): systematic reviews and meta-analysis, clinical trials, and observational studies.
      Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675 Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions Supplementary Figures (S1-S6): PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine. Acknowledgments: I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:
    4. Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.
    5. Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia. Related articles by this author (Jorge H. Ramírez)
    6. Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242
    7. Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181
    8. Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151
    9. Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895
    Recommended articles
    10. Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014;348:g3725
    11. Spence Des. Evidence based medicine is broken. BMJ 2014;348:g22
    12. Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E, et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008;336:1106
    13. Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006;333:597
    14. Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655
    15. Katz D. A holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/
  7. Myanmar Google Search Trends: Online Shopping: Costco

    • ceicdata.com
    Updated Nov 29, 2022
    Cite
    CEICdata.com (2022). Myanmar Google Search Trends: Online Shopping: Costco [Dataset]. https://www.ceicdata.com/en/myanmar/google-search-trends-by-categories
    Explore at:
    Dataset updated
    Nov 29, 2022
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 1, 2022 - Nov 12, 2022
    Area covered
    Myanmar (Burma)
    Description

    Google Search Trends: Online Shopping: Costco data was reported at 0.000 Score on 12 Nov 2022. This stayed constant from the previous figure of 0.000 Score for 11 Nov 2022. Google Search Trends: Online Shopping: Costco data is updated daily, averaging 0.000 Score from Dec 2021 (Median) to 12 Nov 2022, with 347 observations. The data reached an all-time high of 51.000 Score on 16 Dec 2021 and a record low of 0.000 Score on 12 Nov 2022. Google Search Trends: Online Shopping: Costco data remains at active status in CEIC and is reported by Google Trends. The data is categorized under Global Database's Myanmar – Table MM.Google.GT: Google Search Trends: by Categories.

  8. UoS Buildings Image Dataset for Computer Vision Algorithms

    • salford.figshare.com
    application/x-gzip
    Updated Jan 23, 2025
    Cite
    Ali Alameer; Mazin Al-Mosawy (2025). UoS Buildings Image Dataset for Computer Vision Algorithms [Dataset]. http://doi.org/10.17866/rd.salford.20383155.v2
    Explore at:
    application/x-gzip (available download formats)
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    University of Salford
    Authors
    Ali Alameer; Mazin Al-Mosawy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset for this project consists of photos of the buildings of the University of Salford, taken with a mobile phone camera from different angles and distances. Although the task sounds easy, it presented several challenges, summarized below: 1. Obstacles. a. Fixed or unremovable objects. When taking several photos of a building or a landscape from different angles and directions, some of those angles are blocked by a fixed object such as trees and plants, light poles, signs, statues, cabins, bicycle shelters, scooter stands, generators/transformers, construction barriers, construction equipment, or other service equipment, so it is unavoidable that some photos include these objects. This raises three questions: - Will these objects confuse the model/application we intend to create, i.e., will such an obstacle prevent the model/application from identifying the designated building? - Or will the photos be more representative with these objects included, giving the model/application the capability to identify the buildings even with these obstacles present? - How far is the maximum distance for detection? In other words, how far can the mobile device running the application be from the building and still detect (or fail to detect) the designated building? b. Removable and moving objects. - Any university is crowded with staff and students, especially at the rush hours of the day, so it is hard to take some photos without a person appearing in the frame at certain times of day. Due to privacy concerns and respect for those persons, such photos are better excluded. - Parked vehicles, trolleys, and service equipment can be obstacles and might appear in the images; they can also block access to areas from which an image at a certain angle could otherwise be obtained. - Animals, such as dogs, cats, birds, or even squirrels, cannot be avoided in some photos, which are subject to the same questions above.
    2. Weather. In a deep learning project, more data means more accuracy and less error. At this stage of the project it was agreed to collect 50 photos per building; the number of photos could be increased for more accurate results, but due to the time limits of this project it was agreed to keep 50 per building. These photos were taken on cloudy days. To expand this work in the future, photos on sunny, rainy, foggy, snowy, and other weather-condition days can be included, as can photos at different times of day such as night, dawn, and sunset, to give the designated model every possibility of identifying these buildings in all available circumstances.

    1. The selected buildings. It was agreed to select only 10 buildings of the University of Salford for this project, with at least 50 images per building. The selected buildings and the number of images taken are:
    2. Chapman: 74 images
    3. Clifford Whitworth Library: 60 images
    4. Cockcroft: 67 images
    5. Maxwell: 80 images
    6. Media City Campus: 92 images
    7. New Adelphi: 93 images
    8. New Science, Engineering & Environment: 78 images
    9. Newton: 92 images
    10. Sports Centre: 55 images
    11. University House: 60 images. The Peel building is an important landmark of the University of Salford due to its distinctive exterior design, but unfortunately it was excluded from the selection because of maintenance activities at the time the photos for this project were collected: it was partially covered with scaffolding, with a lot of movement of personnel and equipment. If the supervisor suggests that this would be another challenge worth including in the project, then its photos must be collected. There are many other buildings at the University of Salford, and to expand the project in the future all of them could be included. The full list of university buildings can be reviewed on an interactive map at: www.salford.ac.uk/find-us

    12. Expand further. This project can be improved with many more capabilities; again, due to the time limits of this project, these improvements can be implemented later as future work. In simple terms, this project is to create an application that can display a building's name when a mobile device with a camera is pointed at that building. Future features to be added: a. Address/location: this will require collecting additional data, namely the longitude and latitude of each building (or the postcode, which may be shared by several buildings), taking into consideration how close these buildings appear on interactive map applications such as Google Maps, Google Earth, or iMaps. b. Description of the building: what the building is for, which school occupies it, and what facilities it contains. c. Interior images: all the photos at this stage were taken of the buildings' exteriors. Would interior photos make an impact on the model/application? For example, if the user is inside Newton or Chapman and opens the application, would the building be identified, especially given that the interiors of these buildings are highly similar in their corridors, rooms, halls, and labs? Would furniture and assets act as obstacles or as identification marks? d. Directions to a specific area/floor inside the building: if interior images work with the model/application, it would be useful to add a search option that guides the user to a specific area, showing directions with an interactive arrow that updates as the user approaches the destination; for example, if the user is inside the Newton building and searches for lab 141, the application directs them to the first floor. Alternatively, if the application can identify a building from its interior, a drop-down list could be activated with an interactive tab for each floor of that building; selecting a floor tab would open a screen displaying the facilities on that floor, and a different building would activate a different number of floors, as buildings differ in their number of floors. This feature could be improved with a voice assistant that directs the user after a search, similar to the voice assistant in Google Maps but applied to the interiors of the university's buildings. e. Top view: if a drone with a camera can be afforded, it can provide aerial images and top views of the buildings to add to the model/application, but these images may face the same issue as the interior images: the buildings can look similar to each other from the top, with additional obstacles such as water tanks and AC units.

    13. Other Questions:

    14. Will the model/application be reproducible? The presumed answer to this question should be yes, if the model/application is fed with the proper data (images), such as images of restaurants, schools, supermarkets, hospitals, government facilities, etc.
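
    As a usage illustration only: if the images were organised one folder per building (an assumed layout, not stated in the dataset description), a classifier could load them directly with torchvision's ImageFolder. The path and transform values below are placeholders.

      # Sketch under an assumed folder-per-building layout, e.g. uos_buildings/Newton/*.jpg
      import torch
      from torchvision import datasets, transforms

      transform = transforms.Compose([
          transforms.Resize((224, 224)),
          transforms.ToTensor(),
      ])
      dataset = datasets.ImageFolder("uos_buildings", transform=transform)  # hypothetical path
      loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

      print(dataset.classes)              # the 10 building names become class labels
      images, labels = next(iter(loader))
      print(images.shape, labels[:5])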

  9. COVID19 - The New York Times

    • kaggle.com
    zip
    Updated May 18, 2020
    Cite
    Google BigQuery (2020). COVID19 - The New York Times [Dataset]. https://www.kaggle.com/bigquery/covid19-nyt
    Explore at:
    zip (0 bytes; available download formats)
    Dataset updated
    May 18, 2020
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Authors
    Google BigQuery
    Description

    Context

    This is the US coronavirus data repository from The New York Times. This data includes COVID-19 cases and deaths reported by state and county. The New York Times compiled this data based on reports from state and local health agencies. More information on the data repository is available here. For additional reporting and data visualizations, see The New York Times' U.S. coronavirus interactive site.

    Sample Queries

    Query 1

    Which US counties have the most confirmed cases per capita? This query determines which counties have the most cases per 100,000 residents. Note that this may differ from similar queries of other datasets because of differences in reporting lag, methodologies, or other dataset differences.

      SELECT
        covid19.county,
        covid19.state_name,
        total_pop AS county_population,
        confirmed_cases,
        ROUND(confirmed_cases / total_pop * 100000, 2) AS confirmed_cases_per_100000,
        deaths,
        ROUND(deaths / total_pop * 100000, 2) AS deaths_per_100000
      FROM `bigquery-public-data.covid19_nyt.us_counties` covid19
      JOIN `bigquery-public-data.census_bureau_acs.county_2017_5yr` acs
        ON covid19.county_fips_code = acs.geo_id
      WHERE date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
        AND covid19.county_fips_code != "00000"
      ORDER BY confirmed_cases_per_100000 DESC

    Query 2

    How do I calculate the number of new COVID-19 cases per day? This query determines the total number of new cases in each state for each day available in the dataset.

      SELECT
        b.state_name,
        b.date,
        MAX(b.confirmed_cases - a.confirmed_cases) AS daily_confirmed_cases
      FROM (
        SELECT
          state_name AS state,
          state_fips_code,
          confirmed_cases,
          DATE_ADD(date, INTERVAL 1 DAY) AS date_shift
        FROM `bigquery-public-data.covid19_nyt.us_states`
        WHERE confirmed_cases + deaths > 0
      ) a
      JOIN `bigquery-public-data.covid19_nyt.us_states` b
        ON a.state_fips_code = b.state_fips_code
        AND a.date_shift = b.date
      GROUP BY b.state_name, b.date
      ORDER BY date DESC
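
    Either query can also be run programmatically. The sketch below uses the google-cloud-bigquery Python client (assuming installed client libraries and a configured GCP project with credentials) to execute a shortened form of Query 1 and load the result into a DataFrame.

      # Sketch: running Query 1 with the BigQuery Python client.
      # Assumes google-cloud-bigquery is installed and default credentials/project are set.
      from google.cloud import bigquery

      client = bigquery.Client()
      sql = """
      SELECT covid19.county, covid19.state_name, total_pop AS county_population,
             confirmed_cases,
             ROUND(confirmed_cases / total_pop * 100000, 2) AS confirmed_cases_per_100000
      FROM `bigquery-public-data.covid19_nyt.us_counties` covid19
      JOIN `bigquery-public-data.census_bureau_acs.county_2017_5yr` acs
        ON covid19.county_fips_code = acs.geo_id
      WHERE date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
        AND covid19.county_fips_code != "00000"
      ORDER BY confirmed_cases_per_100000 DESC
      """
      df = client.query(sql).to_dataframe()  # needs the pandas/db-dtypes extras installed
      print(df.head())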

  10. MLP-based Learnable Window Size Dataset for Bitcoin Market Price

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Rajabi, Shahab (2023). MLP-based Learnable Window Size Dataset for Bitcoin Market Price [Dataset]. http://doi.org/10.7910/DVN/5YBLKV
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Rajabi, Shahab
    Description

    The dataset of this paper was collected from Google, Blockchain, and the Bitcoin market. In total there are 26 features; however, any feature whose correlation between its variations and the variations of the price was lower than 0.3 was eliminated. Hence, a total of 21 practical features was selected, including Market capitalization, Trade-volume, Transaction-fees USD, Average confirmation time, Difficulty, High price, Low price, Total hash rate, Block-size, Miners-revenue, N-transactions-total, Google searches, Open price, N-payments-per Block, Total circulating Bitcoin, Cost-per-transaction percent, Fees-USD-per transaction, N-unique-addresses, N-transactions-per block, and Output-volume. In addition to the values of these features, for each feature a new supportive feature was created containing the difference between the previous day and the day before the previous day. In terms of the size and history of the dataset, a total of 1,275 training samples, collected from 12 Nov 2018 to 4 Jun 2021, were used in the proposed model to extract patterns of the Bitcoin price.
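
    A minimal sketch of the two preprocessing steps described above: correlation-based feature screening at the 0.3 threshold and the lagged-difference "supportive" features. The file name, DataFrame layout, and column names are assumptions, not taken from the authors' files.

      # Sketch: correlation screening and lagged-difference features.
      # Assumes a daily DataFrame with a "price" column plus raw feature columns.
      import pandas as pd

      df = pd.read_csv("bitcoin_daily.csv", index_col="date", parse_dates=True)  # hypothetical file

      price_change = df["price"].diff()
      kept = []
      for col in df.columns.drop("price"):
          corr = df[col].diff().corr(price_change)   # correlation of day-to-day variations
          if abs(corr) >= 0.3:                       # threshold described in the paper
              kept.append(col)

      features = df[kept].copy()
      for col in kept:                               # supportive feature: difference between the
          features[col + "_diff"] = df[col].shift(1) - df[col].shift(2)  # previous day and the day before it

      features = features.dropna()
      print(features.shape)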

  11. Replication Data for: Computer-Assisted Keyword and Document Set Discovery...

    • search.dataone.org
    Updated Nov 21, 2023
    Cite
    King, Gary; Patrick Lam; Margaret E. Roberts (2023). Replication Data for: Computer-Assisted Keyword and Document Set Discovery from Unstructured Text [Dataset]. http://doi.org/10.7910/DVN/FMJDCD
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    King, Gary; Patrick Lam; Margaret E. Roberts
    Description

    The (unheralded) first step in many applications of automated text analysis involves selecting keywords to choose documents from a large text corpus for further study. Although all substantive results depend on this choice, researchers usually pick keywords in ad hoc ways that are far from optimal and usually biased. Most seem to think that keyword selection is easy, since they do Google searches every day, but we demonstrate that humans perform exceedingly poorly at this basic task. We offer a better approach, one that also can help with following conversations where participants rapidly innovate language to evade authorities, seek political advantage, or express creativity; generic web searching; eDiscovery; look-alike modeling; industry and intelligence analysis; and sentiment and topic analysis. We develop a computer-assisted (as opposed to fully automated or human-only) statistical approach that suggests keywords from available text without needing structured data as inputs. This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm. Our specific approach is based on training classifiers, extracting information from (rather than correcting) their mistakes, and summarizing results with easy-to-understand Boolean search strings. We illustrate how the technique works with analyses of English texts about the Boston Marathon Bombings, Chinese social media posts designed to evade censorship, and others.
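
    The following is a deliberately simplified sketch of the general idea (train a classifier on a small labeled reference set, then mine the vocabulary of the documents it gets wrong as keyword candidates); it is not the authors' algorithm or replication code, and all data and names in it are illustrative.

      # Highly simplified sketch, not the authors' method: surface candidate keywords
      # from the documents a trained classifier misclassifies.
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.linear_model import LogisticRegression
      import numpy as np

      docs = [  # tiny illustrative corpus; real use starts from a large unstructured set
          "boston marathon bombing suspect arrested",
          "runners finish the boston marathon in record time",
          "explosion near the finish line injures spectators",
          "city announces new marathon route for next year",
      ]
      labels = np.array([1, 0, 1, 0])  # 1 = documents of interest

      vec = CountVectorizer(stop_words="english")
      X = vec.fit_transform(docs)
      clf = LogisticRegression().fit(X, labels)

      pred = clf.predict(X)
      mistakes = X[pred != labels]                    # documents the classifier gets wrong
      if mistakes.shape[0]:
          counts = np.asarray(mistakes.sum(axis=0)).ravel()
          terms = np.array(vec.get_feature_names_out())
          print(terms[counts.argsort()[::-1][:10]])   # candidate keywords to inspect by hand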

  12. Lead Scoring Dataset

    • kaggle.com
    zip
    Updated Aug 17, 2020
    Cite
    Amrita Chatterjee (2020). Lead Scoring Dataset [Dataset]. https://www.kaggle.com/amritachatterjee09/lead-scoring-dataset
    Explore at:
    zip (411028 bytes; available download formats)
    Dataset updated
    Aug 17, 2020
    Authors
    Amrita Chatterjee
    Description

    Context

    An education company named X Education sells online courses to industry professionals. On any given day, many professionals who are interested in the courses land on their website and browse for courses.

    The company markets its courses on several websites and search engines like Google. Once these people land on the website, they might browse the courses, fill up a form for a course, or watch some videos. When these people fill up a form providing their email address or phone number, they are classified as a lead. Moreover, the company also gets leads through past referrals. Once these leads are acquired, employees from the sales team start making calls, writing emails, etc. Through this process, some of the leads get converted while most do not. The typical lead conversion rate at X Education is around 30%.

    Now, although X Education gets a lot of leads, its lead conversion rate is very poor. For example, if they acquire 100 leads in a day, only about 30 of them are converted. To make this process more efficient, the company wishes to identify the most promising potential leads, also known as 'Hot Leads'. If they successfully identify this set of leads, the lead conversion rate should go up, as the sales team will then focus on communicating with the promising leads rather than making calls to everyone.

    There are a lot of leads generated in the initial stage (top) but only a few of them come out as paying customers from the bottom. In the middle stage, you need to nurture the potential leads well (i.e. educating the leads about the product, constantly communicating, etc. ) in order to get a higher lead conversion.

    X Education wants to select the most promising leads, i.e. the leads that are most likely to convert into paying customers. The company requires you to build a model that assigns a lead score to each lead such that customers with a higher lead score have a higher conversion chance and customers with a lower lead score have a lower conversion chance. The CEO, in particular, has given a ballpark target lead conversion rate of around 80%.
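
    As one possible baseline (not the prescribed solution), a logistic regression's predicted conversion probability, scaled to 0-100, can serve as the lead score. The sketch below assumes the CSV carries the "Converted" target described under Content; the file name and dropped ID columns are assumptions.

      # Baseline lead-scoring sketch; file name and ID column handling are assumptions.
      import pandas as pd
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split

      df = pd.read_csv("Leads.csv")  # hypothetical file name for the Kaggle CSV
      y = df["Converted"]            # target variable described below
      X = pd.get_dummies(
          df.drop(columns=["Converted", "Prospect ID", "Lead Number"], errors="ignore"),
          drop_first=True,
      ).fillna(0)

      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
      model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

      lead_score = (model.predict_proba(X_test)[:, 1] * 100).round()  # 0-100 lead score
      print(lead_score[:10])
      print("holdout accuracy:", round(model.score(X_test, y_test), 3))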

    Content

    Variables Description * Prospect ID - A unique ID with which the customer is identified. * Lead Number - A lead number assigned to each lead procured. * Lead Origin - The origin identifier with which the customer was identified to be a lead. Includes API, Landing Page Submission, etc. * Lead Source - The source of the lead. Includes Google, Organic Search, Olark Chat, etc. * Do Not Email - An indicator variable selected by the customer indicating whether or not they want to be emailed about the course. * Do Not Call - An indicator variable selected by the customer indicating whether or not they want to be called about the course. * Converted - The target variable. Indicates whether a lead has been successfully converted or not. * TotalVisits - The total number of visits made by the customer on the website. * Total Time Spent on Website - The total time spent by the customer on the website. * Page Views Per Visit - Average number of pages on the website viewed during the visits. * Last Activity - Last activity performed by the customer. Includes Email Opened, Olark Chat Conversation, etc. * Country - The country of the customer. * Specialization - The industry domain in which the customer worked before. Includes the level 'Select Specialization', which means the customer had not selected this option while filling the form. * How did you hear about X Education - The source from which the customer heard about X Education. * What is your current occupation - Indicates whether the customer is a student, unemployed, or employed. * What matters most to you in choosing this course - An option selected by the customer indicating their main motivation for doing this course. * Search - Indicates whether the customer had seen the ad in any of the listed items. * Magazine
    * Newspaper Article * X Education Forums
    * Newspaper * Digital Advertisement * Through Recommendations - Indicates whether the customer came in through recommendations. * Receive More Updates About Our Courses - Indicates whether the customer chose to receive more updates about the courses. * Tags - Tags assigned to customers indicating the current status of the lead. * Lead Quality - Indicates the quality of the lead based on the data and the intuition of the employee assigned to the lead. * Update me on Supply Chain Content - Indicates whether the customer wants updates on the Supply Chain Content. * Get updates on DM Content - Indicates whether the customer wants updates on the DM Content. * Lead Profile - A lead level assigned to each customer based on their profile. * City - The city of the customer. * Asymmetric Activity Index - An index and score assigned to each customer based on their activity and their profile. * Asymmetric Profile Index * Asymmetric Activity Score * Asymmetric Profile Score
    * I agree to pay the amount through cheque - Indicates whether the customer has agreed to pay the amount through cheque or not. * a free copy of Mastering The Interview - Indicates whether the customer wants a free copy of 'Mastering the Interview' or not. * Last Notable Activity - The last notable activity performed by the student.

    Acknowledgements

    UpGrad Case Study

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  13. Capturing the Aftermath of the Dobbs v. Jackson Decision in the Google...

    • dataverse.harvard.edu
    Updated Jan 17, 2023
    Cite
    Brooke Perreault; Anya Wintner; Lan Dau; Eni Mustafaraj (2023). Capturing the Aftermath of the Dobbs v. Jackson Decision in the Google Search Results across 65 U.S. Locations [Dataset]. http://doi.org/10.7910/DVN/YFAH9X
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 17, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Brooke Perreault; Anya Wintner; Lan Dau; Eni Mustafaraj
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Dataset for the paper "Capturing the Aftermath of the Dobbs v. Jackson Decision in the Google Search Results across 65 U.S. Locations", to appear in the proceedings of ICWSM 2023. Starting on the day of the U.S. Supreme Court decision to overturn Roe v. Wade, we collected Google Search result pages for 21 days in 65 U.S. locations for a set of almost 1,700 queries. We stored all the SERPs generated by Google. Because the archives containing these SERPs are much larger than the file limits of the Harvard Dataverse, you can find them at this address: https://cs.wellesley.edu/~credlab/icwsm2023/. Instead, in this repository we share all the files that were created by parsing some of the information in the SERPs: organic search results, top stories, and embedded tweets. We also provide aggregated statistics for the domains appearing in the organic results and the top stories. This dataset can be useful for answering questions about Google Search's algorithms with respect to shaping access to information related to important news events.
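
    As an illustration of how the parsed files might be aggregated (the file and column names below are assumptions, not the repository's actual schema), one could count how often each domain appears in the organic results per location:

      # Sketch: aggregating domain counts from a parsed organic-results file.
      # The file name and the "location"/"url" columns are hypothetical placeholders.
      from urllib.parse import urlparse
      import pandas as pd

      organic = pd.read_csv("organic_results.csv")
      organic["domain"] = organic["url"].map(lambda u: urlparse(u).netloc.removeprefix("www."))

      domain_counts = (
          organic.groupby(["location", "domain"])
          .size()
          .sort_values(ascending=False)
          .reset_index(name="appearances")
      )
      print(domain_counts.head(20))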

  14. Data from: Impact of World Cerebral Palsy Day on Public Interest in Brazil:...

    • tandf.figshare.com
    xlsx
    Updated Nov 30, 2024
    Cite
    Nathalia Caroline Soares Chaves; Adriane Santos de Oliveira; Ana Clara Geremias Fischer; Jessica Paiva Tavares; André Luís Ferreira Meireles (2024). Impact of World Cerebral Palsy Day on Public Interest in Brazil: Evidence from Internet Search Data [Dataset]. http://doi.org/10.6084/m9.figshare.27180082.v1
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Nov 30, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Nathalia Caroline Soares Chaves; Adriane Santos de Oliveira; Ana Clara Geremias Fischer; Jessica Paiva Tavares; André Luís Ferreira Meireles
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study investigated the impact of the World Cerebral Palsy Day (WCPD) campaign on public interest in Brazil using Google Trends data. Google Trends was used to collect Relative Search Volume (RSV) data for "cerebral palsy" from 2004 to 2011 (control years) and 2012 to 2022 (WCPD years). RSV during the 4 weeks around WCPD (period of interest) was compared with the rest of the year (control period) in each timeframe. Regional RSV, search queries, and main topics were also investigated. RSV increased by 62.22% from the pre-campaign to the campaign period. During the WCPD years, a 21.36% RSV increase occurred in campaign weeks, with an average difference of 12.16 (95% CI: 1.74, 22.58), notably in the last five years in the southeast (9.47; 95% CI: 2.93, 16.01) and south (8.66; 95% CI: 1.66, 15.66) macro-regions. The campaign has fulfilled its role, but targeting more vulnerable areas could further amplify its impact.
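
    A sketch of the campaign-versus-control comparison described above, under assumptions about the data layout (weekly RSV values with a flag marking the 4 campaign weeks); the mean difference and a Welch-style 95% CI are computed with scipy. The file and column names are illustrative only.

      # Sketch: compare RSV in the 4 campaign weeks vs. the rest of the year.
      # Assumes a weekly DataFrame with "rsv" and a boolean "campaign_week" column (hypothetical).
      import pandas as pd
      from scipy import stats

      df = pd.read_csv("wcpd_rsv_weekly.csv")
      mask = df["campaign_week"].astype(bool)
      campaign = df.loc[mask, "rsv"]
      control = df.loc[~mask, "rsv"]

      diff = campaign.mean() - control.mean()
      pct_increase = 100 * diff / control.mean()

      res = stats.ttest_ind(campaign, control, equal_var=False)   # Welch's t-test
      ci = res.confidence_interval(confidence_level=0.95)         # needs scipy >= 1.10
      print(f"mean difference: {diff:.2f}  increase: {pct_increase:.2f}%")
      print(f"95% CI of the difference: ({ci.low:.2f}, {ci.high:.2f})")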

  15. Ethereum Cryptocurrency

    • console.cloud.google.com
    Updated Apr 7, 2023
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Ethereum&hl=pt (2023). Ethereum Cryptocurrency [Dataset]. https://console.cloud.google.com/marketplace/product/ethereum/crypto-ethereum-blockchain?hl=pt
    Explore at:
    Dataset updated
    Apr 7, 2023
    Dataset provided by
    Google (http://google.com/)
    Description

    Ethereum is a cryptocurrency which leverages blockchain technology to store transactions in a distributed ledger. A blockchain is an ever-growing "tree" of blocks, where each block contains a number of transactions. To learn more, read the "Ethereum in BigQuery: a Public Dataset for smart contract analytics" blog post by Google Developer Advocate Allen Day. This dataset is part of a larger effort to make cryptocurrency data available in BigQuery through the Google Cloud Public Datasets program. The program is hosting several cryptocurrency datasets, with plans to both expand offerings to include additional cryptocurrencies and reduce the latency of updates. You can find these datasets by searching "cryptocurrency" in GCP Marketplace. For analytics interoperability, we designed a unified schema that allows all Bitcoin-like datasets to share queries. Interested in learning more about how the data from these blockchains was brought into BigQuery? Looking for more ways to analyze the data? Check out the Google Cloud Big Data blog post and try the sample queries below to get started. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1 TB/mo of free tier processing. This means that each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.

  16. Data from: Recognizing the importance of near-home contact with nature for...

    • data.niaid.nih.gov
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Aug 29, 2023
    Cite
    Magdalena Lenda; Piotr Skórka; Małgorzata Jaźwa; Hsien-Yung Lin; Edward Nęcka; Piotr Tryjanowski; Dawid Moroń; Johannes M. H. Knops; Hugh P. Possingham (2023). Recognizing the importance of near-home contact with nature for mental well-being based on the COVID-19 lockdown experience [Dataset]. http://doi.org/10.5061/dryad.fn2z34v1h
    Explore at:
    zip (available download formats)
    Dataset updated
    Aug 29, 2023
    Dataset provided by
    Xi’an Jiaotong-Liverpool University
    Uniwersytet SWPS
    University of Opole
    The University of Queensland
    University of Life Sciences in Poznań
    Institute of Nature Conservation
    Institute of Systematics and Evolution of Animals
    Carleton University
    Authors
    Magdalena Lenda; Piotr Skórka; Małgorzata Jaźwa; Hsien-Yung Lin; Edward Nęcka; Piotr Tryjanowski; Dawid Moroń; Johannes M. H. Knops; Hugh P. Possingham
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Several urban landscape planning solutions have been introduced around the world to find a balance between developing urban spaces, maintaining and restoring biodiversity, and enhancing quality of human life. Our global mini-review, combined with analysis of big data collected from Google Trends at global scale, reveals the importance of enjoying day-to-day contact with nature and engaging in such activities as nature observation and identification and gardening for the mental well-being of humans during the COVID-19 pandemic. Home-based activities, such as watching birds from one's window, identifying species of plants and animals, backyard gardening, and collecting information about nature for citizen science projects, were popular during the first lockdown in spring 2020, when people could not easily venture out of their homes. In our mini-review, we found 37 articles from 28 countries with a total sample of 114,466 people. These papers suggest that home-based engagement with nature was an entertaining and pleasant distraction that helped preserve mental well-being during a challenging time. According to Google Trends, interest in such activities increased during lockdown compared to the previous five years. Millions of people worldwide are chronically or temporarily confined to their homes and neighborhoods because of illness, childcare chores, or elderly care responsibility, which makes it difficult for them to travel far to visit such places as national parks, created through land sparing, where people go to enjoy nature and relieve stress. This article posits that for such people, living in an urban landscape designed to facilitate effortless contact with small natural areas is a more effective way to receive the mental health benefits of contact with nature than visiting a sprawling nature park on rare occasions.

    Methods

    1. Identifying the most common types of activities related to nature observation, gardening, and taxa identification during the first lockdown, based on scientific articles and the non-scientific press

    For scientific articles, in March 2023 we searched Scopus and Google Scholar. For countries where Google is restricted, such as China, similar results will be available from other scientific browsers, with the highest number of results from our database being available from Scopus. We used the Google Search browser to search for globally published non-scientific press articles. Some selection criteria were applied during article review. Specifically, we excluded articles that were not about the first lockdown; did not study activities at a local scale (from a balcony, window, or backyard) but rather in areas far away from home (e.g., visiting forests); studied the mental health effect of observing indoor potted plants and pet animals; or only transiently mentioned the topic or keyword without going into any scientific detail. We included all papers that met our criteria, that is, studies that analyzed our chosen topic with experiments or planned observations. We included all research papers, but not letters that made claims without any data. Google Scholar automatically screened the title, abstract, keywords, and the whole text of each article for the keywords we entered. All articles that met our criteria were read and double-checked for keywords and content related to the keywords (e.g., synonyms, or whether they presented content about the relevant topic without using the specific keywords).

    We identified, from both types of articles, the major nature-based activities that people engaged in during the first lockdown in the spring of 2020. Keywords used in this study were grouped into six main topics: (1) COVID-19 pandemic; (2) nature-oriented activity focused on nature observation, identification of different taxa, or gardening; (3) mental well-being; (4) activities performed from a balcony, window, or in gardens; (5) entertainment; and (6) citizen science (see Table 1 for all keywords).

    2. Increase in global trends in interest in nature observation, gardening, and taxa identification during the first lockdown

    We used the categorical cluster method, combined with big data from Google Trends (downloaded on 1 September 2020) and anomaly detection, to identify trend anomalies globally in people's interests. We used this combination of methods to examine whether interest in nature-based activities that were mentioned in scientific and non-scientific press articles increased during the first lockdown. Keywords linked with the main types of nature-oriented activities, as identified from press and scientific articles and used according to the categorical clustering method, were classified into the following six main categories: (1) global interest in bird-watching and bird identification combined with citizen science; (2) global interest in plant identification and gardening combined with citizen science; (3) global interest in butterfly watching; (4) local interest in early-spring (lockdown time), summer, or autumn flowering species that usually can be found in Central European (country: Poland) backyards; (5) global interest in traveling and social activities; and (6) global interest in nature areas and activities typically enjoyed during holidays and thus requiring traveling to land-spared nature reserves. The six categories were divided into 15 subcategories so that we could attach relevant words or phrases belonging to the same cluster and typically related to the activity (according to Google Trends and the Google browser's automatic suggestions; e.g., people who searched for "bird-watching" typically also searched for "binoculars," "bird feeder," "bird nest," and "birdhouse"). The subcategories and keywords used for data collection about trends in society's interest in the studied topic from Google Trends are as follows.

    Bird-watching: “binoculars,” “bird feeder,” “bird nest,” “birdhouse,” “bird-watching”; Bird identification: “bird app,” “bird identification,” “bird identification app,” “bird identifier,” “bird song app”; Bird-watching combined with citizen science: “bird guide,” “bird identification,” “eBird,” “feeding birds,” “iNaturalist”; Citizen science and bird-watching apps: “BirdNET,” “BirdSong ID,” “eBird,” “iNaturalist,” “Merlin Bird ID”; Gardening: “gardening,” “planting,” “seedling,” “seeds,” “soil”; Shopping for gardening: “garden shop,” “plant buy,” “plant ebay,” “plant sell,” “plant shop”; Plant identification apps: “FlowerChecker,” “LeafSnap,” “NatureGate,” “Plantifier,” “PlantSnap”; Citizen science and plant identification: “iNaturalist,” “plant app,” “plant check,” “plant identification app,” “plant identifier”; Flowers that were flowering in gardens during lockdown in Poland: “fiołek” (viola), “koniczyna” (shamrock), “mlecz” (dandelion), “pierwiosnek” (primose), “stokrotka” (daisy). They are typical early-spring flowers growing in the gardens in Central Europe. We had to be more specific in this search because there are no plant species blooming across the world at the same time. These plant species have well-known biology; thus, we could easily interpret these results; Flowers that were not flowering during lockdown in Poland: “chaber” (cornflower), “mak” (poppy), “nawłoć” (goldenrod), “róża” (rose), “rumianek” (chamomile). They are typical mid-summer flowering plants often planted in gardens; Interest in traveling long distances and in social activities that involve many people: “airport,” “bus,” “café,” “driving,” “pub”; Single or mass commuting, and traveling: “bike,” “boat,” “car,” “flight,” “train”; Interest in distant places and activities for visiting natural areas: “forest,” “nature park,” “safari,” “trekking,” “trip”; Places and activities for holidays (typically located far away): “coral reef,” “rainforest,” “safari,” “savanna,” “snorkeling”; Butterfly watching: “butterfly watching,” “butterfly identification,” “butterfly app,” “butterfly net,” “butterfly guide”;

    In Google Trends, we set the following filters: global search, dates: July 2016–July 2020; language: English.
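
    For reproducibility, trend series like these can be pulled either by hand from trends.google.com or with the unofficial pytrends package; the authors do not state which tool they used, so the sketch below is only an assumption-labeled illustration that requests one of the subcategories listed above with the same global scope and July 2016 to July 2020 window.

      # Sketch using the unofficial pytrends client (an assumption, not the authors' tooling)
      # to fetch one keyword cluster, worldwide, 2016-07 to 2020-07.
      from pytrends.request import TrendReq

      pytrends = TrendReq(hl="en-US", tz=0)
      birdwatching = ["binoculars", "bird feeder", "bird nest", "birdhouse", "bird-watching"]

      pytrends.build_payload(birdwatching, timeframe="2016-07-01 2020-07-31", geo="")  # geo="" = worldwide
      interest = pytrends.interest_over_time()          # weekly relative search volumes, 0-100
      print(interest.drop(columns="isPartial").tail())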

  17. Query auto-completions for German politicians of the 18th Bundestag

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Schaer, Philipp (2020). Query auto-completions for German politicians of the 18th Bundestag [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3462045
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Schaer, Philipp
    Samokhina, Anastasiia
    Heisenberg, Gernot
    Bonart, Malte
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    bundestag.csv - UTF-8 encoded comma separated text file

    This dataset contains the members of the 18th German Bundestag in the constitution of late 2016.

    name of the politician

    birthday

    party membership of the politician

    state of the politician

    gender of the politician

    age of the politician (as of 2017)

    number of unique auto-completions assigned to topic: "location information"

    number of unique auto-completions assigned to topic: "personal and emotional"

    number of unique auto-completions assigned to topic: "politics and economics"

    total number of unique auto-completions

    terms.csv - UTF-8 encoded comma separated text file

    This dataset contains the unordered and pooled auto-completions for the German politicians from Bing search (http://api.bing.net/osjson.aspx), from Duck-Duck-Go (https://duckduckgo.com/ac/), and from Google search (http://clients1.google.de/complete/search). The data was crawled (mostly) twice per day from 2017/02/03 to 2017/06/19. German language settings were used for Google and Bing; the English language setting was used for Duck-Duck-Go. The API requests were sent from an IP address in Cologne, Germany. A request sketch follows the column list below.

    google, bing or ddg

    the query term, matches the name of the politican in the file

    the suggested query auto-completion
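
    As an illustration only, the sketch below shows how the two files could be joined and summarised with pandas. The exact column headers are not given in the description, so the names used here (party, total_completions, query, suggestion, name) are placeholders.

```python
# Hypothetical exploration of bundestag.csv and terms.csv; column names are
# assumed, since the dataset description lists the fields but not their headers.
import pandas as pd

politicians = pd.read_csv("bundestag.csv", encoding="utf-8")
completions = pd.read_csv("terms.csv", encoding="utf-8")

# Total number of unique auto-completions per party (assumed column names).
per_party = politicians.groupby("party")["total_completions"].sum()
print(per_party.sort_values(ascending=False))

# Attach politician metadata to each suggestion by matching the query term
# against the politician's name.
merged = completions.merge(politicians, left_on="query", right_on="name", how="left")
print(merged[["query", "suggestion", "party"]].head())
```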

  18. Top 200 Youtubers Data (cleaned)

    • kaggle.com
    Updated Jul 8, 2022
    Cite
    Syed Jafer (2022). Top 200 Youtubers Data (cleaned) [Dataset]. https://www.kaggle.com/syedjaferk/top-200-youtubers-cleaned/discussion
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 8, 2022
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Syed Jafer
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    YouTube is an American online video sharing and social media platform headquartered in San Bruno, California. It was launched on February 14, 2005, by Steve Chen, Chad Hurley, and Jawed Karim. It is owned by Google, and is the second most visited website, after Google Search. YouTube has more than 2.5 billion monthly users who collectively watch more than one billion hours of videos each day. As of May 2019, videos were being uploaded at a rate of more than 500 hours of content per minute.

    YouTube is also widely used to influence and educate people (for many users, including the dataset author, it serves as a free university) and to sway followers on specific issues, which can affect the social order in various ways.

  19. f

    #WLIC2016 Most Frequent Terms Roundup

    • city.figshare.com
    bin
    Updated May 31, 2023
    Cite
    Ernesto Priego (2023). #WLIC2016 Most Frequent Terms Roundup [Dataset]. http://doi.org/10.6084/m9.figshare.3749367.v2
    Explore at:
    bin. Available download formats.
    Dataset updated
    May 31, 2023
    Dataset provided by
    City, University of London
    Authors
    Ernesto Priego
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    IFLA stands for the International Federation of Library Associations and Institutions. The IFLA World Library and Information Congress 2016 and 2nd IFLA General Conference and Assembly, ‘Connections. Collaboration. Community’, took place 13–19 August 2016 at the Greater Columbus Convention Center (GCCC) in Columbus, Ohio, United States. The official hashtag of the conference was #WLIC2016.

    This spreadsheet contains the results of a text analysis of 22,327 Tweets publicly labeled with #WLIC2016 between Sunday 14 and Thursday 18 August 2016. The source dataset was collected with a Twitter Archiving Google Spreadsheet, and the automated text analysis was done with the Terms tool from Voyant Tools. The spreadsheet contains:

    A sheet with a table summarising the source archive.
    A sheet with a table detailing Tweet counts per day.
    Sheets with the 'raw' (no stop words, no manual refining) tables of the top 300 most frequent terms and their counts for the Sun-Thu corpus and each individual daily corpus.
    Sheets with the 'edited' (English stop-word filter applied, manually refined) tables of the top 50 most frequent terms and their counts for the Sun-Thu corpus and each individual daily corpus.
    A sheet with a comparison table of the top 50 terms per day.

    Other considerations: Only Tweets published by accounts with at least one follower were included in the source archive. Both research and experience show that the Twitter Search API is not 100% reliable; large Tweet volumes affect the collection process, and the API might "over-represent the more central users", not offering "an accurate picture of peripheral activity" (González-Bailon, Sandra, et al., 2012). Apart from the filters and limitations already declared, it cannot be guaranteed that each and every Tweet tagged with #WLIC2016 during the indicated period was analysed. The dataset is shared for archival, comparative and indicative educational research purposes only.

    Only content from public accounts, obtained from the Twitter Search API, was analysed. The source data is also publicly available to all Twitter users via the Twitter Search API, and to anyone with an Internet connection via the Twitter and Twitter Search web clients and mobile apps, without the need for a Twitter account. This file contains the results of analyses of Tweets that were published openly on the Web with the queried hashtag; the source Tweets are not included. The content of the source Tweets is the responsibility of the original authors. Original Tweets are likely to be copyrighted by their individual authors, but please check individually. This work is shared to archive, document and encourage open educational research into scholarly activity on Twitter. The resulting dataset does not contain complete Tweets nor Twitter metadata, and no private personal information was shared. The collection, analysis and sharing of the data has been enabled and allowed by Twitter's Privacy Policy, and the sharing of the results complies with Twitter's Developer Rules of the Road. A hashtag is metadata that users freely choose so that their content is associated with, directly linked to and categorised under the chosen hashtag. The purpose of hashtags is to organise and describe information/outputs under the relevant label in order to enhance the discoverability of the labeled information/outputs (Tweets in this case).
    Tweets published publicly by scholars or other professionals during academic conferences are often publicly tagged (labeled) with a hashtag dedicated to the conference in question. This practice used to be confined to a few 'niche' fields; it is increasingly becoming the norm rather than the exception. Though every Tweeter's reason for using hashtags can be neither generalised nor predicted, it can be argued that scholarly Twitter users form specialised, self-selecting public professional networks that tend to observe scholarly practices and accepted modes of social and professional behaviour. In general terms, scholarly Twitter users willingly and consciously tag their public Tweets with a conference hashtag as a means to network and to promote, report from, reflect on, comment on and generally contribute publicly to the scholarly conversation around conferences. As Twitter users, conference hashtag contributors have agreed to Twitter's privacy and data-sharing policies. Professional associations such as the Modern Language Association and the American Psychological Association recognise Tweets as citeable scholarly outputs. Archiving scholarly Tweets is a means to preserve this form of rapid online scholarship that otherwise would very likely become unretrievable as time passes; Twitter's Search API has well-known temporal limitations for retrospective historical search and collection.

    Beyond individual Tweets as scholarly outputs, the collective scholarly activity on Twitter around a conference, academic project or event can provide interesting insights for the contemporary history of scholarly communications. Though this work has limitations and might not be thoroughly systematic, it is hoped it can contribute to developing new insights into a discipline's public concerns as expressed on Twitter over time.

    As is increasingly recommended for data sharing, the CC0 license has been applied to the resulting output in the repository. It is important, however, to bear in mind that some terms appearing in the dataset might be licensed differently on an individual basis; copyright of the source Tweets (and sometimes of individual terms) belongs to their authors. Authorial, curatorial and collection work has been performed on the shared file as a curated dataset resulting from analysis, in order to make it available as part of the scholarly record. If this dataset is consulted, attribution is always welcome. Ideally, for proper reproducibility and to encourage other studies, the whole archive dataset should be available. Those wishing to obtain the full Tweets should still be able to retrieve them themselves via text and data mining methods.
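
    As a toy illustration of the kind of term-frequency roundup described above, the sketch below counts the most frequent terms in a handful of tweet texts after applying a small English stop-word filter. It is not the original pipeline (a Twitter Archiving Google Spreadsheet plus the Voyant Tools Terms tool); the example tweets and stop-word list are invented.

```python
# Toy term-frequency roundup over tweet texts with a basic stop-word filter.
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "at", "on", "is", "rt"}

def top_terms(tweets, n=50):
    counts = Counter()
    for text in tweets:
        # Lowercase, drop URLs, keep word-like tokens including #hashtags and @mentions.
        text = re.sub(r"https?://\S+", " ", text.lower())
        tokens = re.findall(r"[#@]?\w+", text)
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return counts.most_common(n)

example_tweets = [
    "Opening session at #WLIC2016 in Columbus: connections, collaboration, community!",
    "Great talk on open access and libraries #WLIC2016 https://example.org/x",
]
print(top_terms(example_tweets, n=10))
```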

  20. f

    Data_Sheet_1_European national health plans and the monitoring of online...

    • frontiersin.figshare.com
    pdf
    Updated Jun 12, 2023
    Cite
    Irene Bosch-Frigola; Fernando Coca-Villalba; María José Pérez-Lacasta; Misericòrdia Carles-Lavila (2023). Data_Sheet_1_European national health plans and the monitoring of online searches for information on diabetes mellitus in different European healthcare systems.PDF [Dataset]. http://doi.org/10.3389/fpubh.2022.1023404.s001
    Explore at:
    pdf. Available download formats.
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    Frontiers
    Authors
    Irene Bosch-Frigola; Fernando Coca-Villalba; María José Pérez-Lacasta; Misericòrdia Carles-Lavila
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Diabetes mellitus (DM) is a serious non-communicable disease (NCD) whose management relies on patients being aware of their condition, being proactive, and having adequate medical care. European countries' healthcare models are aware of the impact of these variables. This study evaluates the impact of online health information seeking behavior (OHISB) around World Diabetes Mellitus Day (WDMD) in European countries from 2014 to 2019 by grouping countries according to changes in citizens' search behavior, diabetes mellitus prevalence, the existence of National Health Plans (NHP), and their respective healthcare systems. We extracted data from the Global Burden of Disease study, Google Trends (GT), the Public Health European Commission, the European Coalition for Diabetes, and the Spanish Ministry of Health. First, we used broken-line models to analyze significant changes in search trends (GT) in European Union member countries in the 30-day intervals before and after the WDMD (November 14) from 2014 to 2019. The results were then used in a second phase to group these countries by factor analysis of mixed data (FAMD), using the prevalence of DM, the existence of NHP, and the health model of each country. The calculations were processed using R software (gtrendsR, segmented, Factoextra, and FactoMineR). We established changes in search trends before and after the WDMD, highlighting unevenness among European countries. Significant changes, however, were mostly observed among countries with NHP. These changes in search trends were not only significant but also recurred over time, especially in countries belonging to the Beveridge model (Portugal, Spain, and Sweden) with NHPs in place. Greater awareness of diabetes mellitus among the population and continuous improvements in NHP can improve patients' quality of life, thus impacting disease management and healthcare expenditure.
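
    The study's analysis was carried out in R (gtrendsR and segmented). As a rough, language-agnostic illustration of the broken-line idea, the sketch below fits a one-breakpoint piecewise-linear model to a synthetic daily search-interest series by grid-searching for the breakpoint that minimises the residual sum of squares; the data and function names are invented for the example and do not come from the study.

```python
# Illustrative one-breakpoint ("broken-line") fit to a synthetic daily series.
import numpy as np

def broken_line_fit(y):
    """Return (breakpoint_index, fitted_values) minimising the residual SSE."""
    x = np.arange(len(y), dtype=float)
    best_bp, best_sse, best_fit = None, np.inf, None
    for bp in range(2, len(y) - 2):  # keep at least two points in each segment
        # Design matrix: intercept, global slope, and slope change after bp.
        hinge = np.clip(x - bp, 0.0, None)
        X = np.column_stack([np.ones_like(x), x, hinge])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        fitted = X @ beta
        sse = float(np.sum((y - fitted) ** 2))
        if sse < best_sse:
            best_bp, best_sse, best_fit = bp, sse, fitted
    return best_bp, best_fit

# Synthetic series: flat interest for 30 days, then a rising trend (a stand-in
# for a change in search behaviour around November 14).
rng = np.random.default_rng(0)
y = np.concatenate([np.full(30, 20.0), 20.0 + 2.0 * np.arange(31)])
y = y + rng.normal(0.0, 2.0, size=y.size)
bp, fitted = broken_line_fit(y)
print("estimated breakpoint at day", bp)
```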
