OpenWeb Ninja's Google Images Data (Google SERP Data) API provides real-time image search capabilities for images sourced from all public sources on the web.
The API enables you to search and access more than 100 billion images from across the web including advanced filtering capabilities as supported by Google Advanced Image Search. The API provides Google Images Data (Google SERP Data) including details such as image URL, title, size information, thumbnail, source information, and more data points. The API supports advanced filtering and options such as file type, image color, usage rights, creation time, and more. In addition, any Advanced Google Search operators can be used with the API.
OpenWeb Ninja's Google Images Data & Google SERP Data API common use cases:
Creative Media Production: Enhance digital content with a vast array of real-time images, ensuring engaging and brand-aligned visuals for blogs, social media, and advertising.
AI Model Enhancement: Train and refine AI models with diverse, annotated images, improving object recognition and image classification accuracy.
Trend Analysis: Identify emerging market trends and consumer preferences through real-time visual data, enabling proactive business decisions.
Innovative Product Design: Inspire product innovation by exploring current design trends and competitor products, ensuring market-relevant offerings.
Advanced Search Optimization: Improve search engines and applications with enriched image datasets, providing users with accurate, relevant, and visually appealing search results.
OpenWeb Ninja's Annotated Imagery Data & Google SERP Data Stats & Capabilities:
100B+ Images: Access an extensive database of over 100 billion images.
Images Data from all Public Sources (Google SERP Data): Benefit from a comprehensive aggregation of image data from various public websites, ensuring a wide range of sources and perspectives.
Extensive Search and Filtering Capabilities: Utilize advanced search operators and filters to refine image searches by file type, color, usage rights, creation time, and more, making it easy to find exactly what you need.
Rich Data Points: Each image comes with more than 10 data points, including URL, title (annotation), size information, thumbnail, and source information, providing a detailed context for each image.
Welcome to Apiscrapy, your ultimate destination for comprehensive location-based intelligence. As an AI-driven web scraping and automation platform, Apiscrapy excels in converting raw web data into polished, ready-to-use data APIs. With a unique capability to collect Google Address Data, Google Address API, Google Location API, Google Map, and Google Location Data with 100% accuracy, we redefine possibilities in location intelligence.
Key Features:
Unparalleled Data Variety: Apiscrapy offers a diverse range of address-related datasets, including Google Address Data and Google Location Data. Whether you seek B2B address data or detailed insights for various industries, we cover it all.
Integration with Google Address API: Seamlessly integrate our datasets with the powerful Google Address API. This collaboration ensures not just accessibility but a robust combination that amplifies the precision of your location-based insights.
Business Location Precision: Experience a new level of precision in business decision-making with our address data. Apiscrapy delivers accurate and up-to-date business locations, enhancing your strategic planning and expansion efforts.
Tailored B2B Marketing: Customize your B2B marketing strategies with precision using our detailed B2B address data. Target specific geographic areas, refine your approach, and maximize the impact of your marketing efforts.
Use Cases:
Location-Based Services: Companies use Google Address Data to provide location-based services such as navigation, local search, and location-aware advertisements.
Logistics and Transportation: Logistics companies utilize Google Address Data for route optimization, fleet management, and delivery tracking.
E-commerce: Online retailers integrate address autocomplete features powered by Google Address Data to simplify the checkout process and ensure accurate delivery addresses.
Real Estate: Real estate agents and property websites leverage Google Address Data to provide accurate property listings, neighborhood information, and proximity to amenities.
Urban Planning and Development: City planners and developers utilize Google Address Data to analyze population density, traffic patterns, and infrastructure needs for urban planning and development projects.
Market Analysis: Businesses use Google Address Data for market analysis, including identifying target demographics, analyzing competitor locations, and selecting optimal locations for new stores or offices.
Geographic Information Systems (GIS): GIS professionals use Google Address Data as a foundational layer for mapping and spatial analysis in fields such as environmental science, public health, and natural resource management.
Government Services: Government agencies utilize Google Address Data for census enumeration, voter registration, tax assessment, and planning public infrastructure projects.
Tourism and Hospitality: Travel agencies, hotels, and tourism websites incorporate Google Address Data to provide location-based recommendations, itinerary planning, and booking services for travelers.
Discover the difference with Apiscrapy – where accuracy meets diversity in address-related datasets, including Google Address Data, Google Address API, Google Location API, and more. Redefine your approach to location intelligence and make data-driven decisions with confidence. Revolutionize your business strategies today!
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The Open Targets Platform is a comprehensive data integration tool that supports systematic identification and prioritisation of potential therapeutic drug targets. By integrating publicly available datasets including data generated by the Open Targets consortium, the Platform builds and scores target-disease associations to assist in drug target identification and prioritisation. It also integrates relevant annotation information about targets, diseases or phenotypes, variants, GWAS and molQTL studies, credible sets and drugs, as well as their most relevant relationships. The Platform is a freely available resource that is actively maintained with quarterly data updates. Data is available through an intuitive user interface, an API, and data downloads. The pipeline and infrastructure codebases are open-source and the licence allows the creation of self-hosted private instances of the Platform with custom data. To learn more about the Platform, visit our Platform documentation or join the Open Targets Community. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
https://creativecommons.org/publicdomain/zero/1.0/
Cannabis is a genus of flowering plants in the family Cannabaceae.
Source: https://en.wikipedia.org/wiki/Cannabis
In October 2016, Phylos Bioscience released a genomic open dataset of approximately 850 strains of Cannabis via the Open Cannabis Project. In combination with other genomics datasets made available by Courtagen Life Sciences, Michigan State University, NCBI, Sunrise Medicinal, University of Calgary, University of Toronto, and Yunnan Academy of Agricultural Sciences, the total amount of publicly available data exceeds 1,000 samples taken from nearly as many unique strains.
These data were retrieved from the National Center for Biotechnology Information’s Sequence Read Archive (NCBI SRA), processed using the BWA aligner and FreeBayes variant caller, indexed with the Google Genomics API, and exported to BigQuery for analysis. Data are available directly from Google Cloud Storage at gs://gcs-public-data--genomics/cannabis, as well as via the Google Genomics API as dataset ID 918853309083001239, and an additional duplicated subset of only transcriptome data as dataset ID 94241232795910911, as well as in the BigQuery dataset bigquery-public-data:genomics_cannabis.
All tables in the Cannabis Genomes Project dataset have a suffix like _201703. The suffix is referred to as [BUILD_DATE] in the descriptions below. The dataset is updated frequently as new releases become available.
The following tables are included in the Cannabis Genomes Project dataset:
Sample_info contains fields extracted for each SRA sample, including the SRA sample ID and other data that give indications about the type of sample. Sample types include: strain, library prep methods, and sequencing technology. See SRP008673 for an example of upstream sample data. SRP008673 is the University of Toronto sequencing of Cannabis Sativa subspecies Purple Kush.
MNPR01_reference_[BUILD_DATE] contains reference sequence names and lengths for the draft assembly of Cannabis Sativa subspecies Cannatonic produced by Phylos Bioscience. This table contains contig identifiers and their lengths.
MNPR01_[BUILD_DATE] contains variant calls for all included samples and types (genomic, transcriptomic) aligned to the MNPR01_reference_[BUILD_DATE] table. Samples can be found in the sample_info table. The MNPR01_[BUILD_DATE] table is exported using the Google Genomics BigQuery variants schema. This table is useful for general analysis of the Cannabis genome.
MNPR01_transcriptome_[BUILD_DATE] is similar to the MNPR01_[BUILD_DATE] table, but it includes only the subset transcriptomic samples. This table is useful for transcribed gene-level analysis of the Cannabis genome.
Fork this kernel to get started with this dataset.
Dataset Source: http://opencannabisproject.org/
Category: Genomics
Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://www.ncbi.nlm.nih.gov/home/about/policies.shtml - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Update frequency: As additional data are released to GenBank
View in BigQuery: https://bigquery.cloud.google.com/dataset/bigquery-public-data:genomics_cannabis
View in Google Cloud Storage: gs://gcs-public-data--genomics/cannabis
Banner Photo by Rick Proctor from Unsplash.
Which Cannabis samples are included in the variants table?
Which contigs in the MNPR01_reference_[BUILD_DATE] table have the highest density of variants?
How many variants does each sample have at the THC Synthase gene (THCA1) locus?
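The first question above could be approached with a query along the following lines. This is a minimal sketch, not the dataset authors' method: it only builds the SQL text, it uses the dotted Standard SQL form of the dataset name, and the column names `call` and `call_set_name` are assumed from the Google Genomics BigQuery variants schema and should be checked against the actual table before running.

```python
# Sketch: build a BigQuery Standard SQL query for the variants table.
# Assumption: the table follows the Google Genomics variants schema, with a
# repeated `call` record that carries a `call_set_name` field per sample.

DATASET = "bigquery-public-data.genomics_cannabis"


def table_id(base: str, build_date: str) -> str:
    """Return the fully qualified table ID for a given [BUILD_DATE] suffix."""
    if not (len(build_date) == 6 and build_date.isdigit()):
        raise ValueError("build_date is expected to look like '201703'")
    return f"{DATASET}.{base}_{build_date}"


def samples_query(build_date: str) -> str:
    """Build a query listing each sample and the number of variant rows it appears in."""
    table = table_id("MNPR01", build_date)
    return (
        "SELECT c.call_set_name AS sample, COUNT(*) AS variant_rows\n"
        f"FROM `{table}` AS v, UNNEST(v.call) AS c\n"
        "GROUP BY sample\n"
        "ORDER BY variant_rows DESC"
    )


if __name__ == "__main__":
    print(samples_query("201703"))
```

The printed text can be pasted into the BigQuery console or passed to a BigQuery client library; the resulting sample names can then be looked up in the sample_info table.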
APISCRAPY is your premier provider of Map Data solutions. Map Data encompasses various information related to geographic locations, including Google Map Data, Location Data, Address Data, and Business Location Data. Our advanced Google Map Data Scraper sets us apart by extracting comprehensive and accurate data from Google Maps and other platforms.
What sets APISCRAPY's Map Data apart are its key benefits:
Accuracy: Our scraping technology ensures the highest level of accuracy, providing reliable data for informed decision-making. We employ advanced algorithms to filter out irrelevant or outdated information, ensuring that you receive only the most relevant and up-to-date data.
Accessibility: With our data readily available through APIs, integration into existing systems is seamless, saving time and resources. Our APIs are easy to use and well-documented, allowing for quick implementation into your workflows. Whether you're a developer building a custom application or a business analyst conducting market research, our APIs provide the flexibility and accessibility you need.
Customization: We understand that every business has unique needs and requirements. That's why we offer tailored solutions to meet specific business needs. Whether you need data for a one-time project or ongoing monitoring, we can customize our services to suit your needs. Our team of experts is always available to provide support and guidance, ensuring that you get the most out of our Map Data solutions.
Our Map Data solutions cater to various use cases:
B2B Marketing: Gain insights into customer demographics and behavior for targeted advertising and personalized messaging. Identify potential customers based on their geographic location, interests, and purchasing behavior.
Logistics Optimization: Utilize Location Data to optimize delivery routes and improve operational efficiency. Identify the most efficient routes based on factors such as traffic patterns, weather conditions, and delivery deadlines.
Real Estate Development: Identify prime locations for new ventures using Business Location Data for market analysis. Analyze factors such as population density, income levels, and competition to identify opportunities for growth and expansion.
Geospatial Analysis: Leverage Map Data for spatial analysis, urban planning, and environmental monitoring. Identify trends and patterns in geographic data to inform decision-making in areas such as land use planning, resource management, and disaster response.
Retail Expansion: Determine optimal locations for new stores or franchises using Location Data and Address Data. Analyze factors such as foot traffic, proximity to competitors, and demographic characteristics to identify locations with the highest potential for success.
Competitive Analysis: Analyze competitors' business locations and market presence for strategic planning. Identify areas of opportunity and potential threats to your business by analyzing competitors' geographic footprint, market share, and customer demographics.
Experience the power of APISCRAPY's Map Data solutions today and unlock new opportunities for your business. With our accurate and accessible data, you can make informed decisions, drive growth, and stay ahead of the competition.
[ Related tags: Map Data, Google Map Data, Google Map Data Scraper, B2B Marketing, Location Data, Map Data, Google Data, Location Data, Address Data, Business location data, map scraping data, Google map data extraction, Transport and Logistic Data, Mobile Location Data, Mobility Data, and IP Address Data, business listings APIs, map data, map datasets, map APIs, poi dataset, GPS, Location Intelligence, Retail Site Selection, Sentiment Analysis, Marketing Data Enrichment, Point of Interest (POI) Mapping]
USPTO Patent Examiner Data System (PEDS) API Data contains data from the examination process of USPTO patent applications. PEDS contains the bibliographic, published document and patent term extension data tabs in Public PAIR from 1981 to present. There is also some data dating back to 1935.
Fast and Reliable real-time API access to global public event data from Google Events - the largest public event data aggregate on the web. The API provides data for local/physical events and online/virtual events.
https://brightdata.com/license
The Google Maps dataset is ideal for getting extensive information on businesses anywhere in the world. Easily filter by location, business type, and other factors to get the exact data you need. The Google Maps dataset includes all major data points: timestamp, name, category, address, description, open website, phone number, open_hours, open_hours_updated, reviews_count, rating, main_image, reviews, url, lat, lon, place_id, country, and more.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Patent Examination Data System gives users access to multiple records of USPTO patent application or patent filing status at no cost. PEDS is updated daily and mirrors the data available in the Patent Application Location and Monitoring system (PALM). PEDS provides access to public applications including: published patent applications and patents. PCT applications that have not been published by WIPO. Any applications that have not been released by the USPTO will not be available in PEDS.
Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.
"Patent Examination Data System" by the USPTO, for public use.
Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_peds
Banner photo by Thought Catalog on Unsplash
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Machine learning (ML) methods enable prediction of the properties of chemical structures without computationally expensive ab initio calculations. The quality of such predictions depends on the reference data that was used to train the model. In this work, we introduce the QCML dataset: A comprehensive dataset for training ML models for quantum chemistry. The QCML dataset systematically covers chemical space with small molecules consisting of up to 8 heavy atoms and includes elements from a large fraction of the periodic table, as well as different electronic states. Starting from chemical graphs, conformer search and normal mode sampling are used to generate both equilibrium and off-equilibrium 3D structures, for which various properties are calculated with semi-empirical methods (14.7 billion entries) and density functional theory (33.5 million entries). The covered properties include energies, forces, multipole moments, and other quantities, e.g. Kohn-Sham matrices. We provide a first demonstration of the utility of our dataset by training ML-based force fields on the data and applying them to run molecular dynamics simulations.
The data is available as TensorFlow dataset (TFDS) and can be accessed from the publicly available Google Cloud Storage at gs://qcml-datasets/tfds/. (See "Directory structure" below.)
For information on different access options (command-line tools, client libraries, etc), please see https://cloud.google.com/storage/docs/access-public-data.
Directory structure
Builder configurations
Format: Builder config name: number of shards (rounded total size)
Semi-empirical calculations:
DFT calculations:
Real-time API access to rich Job Postings Data with 200M+ job postings & Salary Data sourced from Google for Jobs - global aggregate of LinkedIn, Indeed, Glassdoor, ZipRecruiter, and all public job sites across the web.
https://www.datainsightsmarket.com/privacy-policy
The Open Banking API market is experiencing robust growth, driven by increasing digitalization, stringent regulatory frameworks promoting data sharing, and rising consumer demand for personalized financial services. The market, estimated at $15 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching approximately $70 billion by 2033. This significant expansion is fueled by the proliferation of innovative financial products and services enabled by Open Banking, such as personalized lending, improved fraud detection, and streamlined account aggregation. The Payment Initiation Service API (PISP) segment currently holds the largest market share, due to its wide applicability across various sectors and ease of integration for both businesses and consumers. However, the Account Information Service API (AISP) segment is expected to witness substantial growth in the coming years, driven by increasing demand for enhanced financial management tools and data-driven insights. Key players such as Google, Mastercard, Plaid, and TrueLayer are actively shaping the market landscape through strategic partnerships, technological advancements, and aggressive expansion strategies. Geographic segmentation reveals North America and Europe as currently dominant regions, driven by early adoption of Open Banking regulations and a mature fintech ecosystem. However, the Asia-Pacific region is poised for rapid growth, owing to burgeoning digital adoption and increasing government support for financial technology innovation. While challenges exist, including data security concerns and the need for standardized APIs, the overall market outlook remains exceptionally positive, with significant potential for continued expansion and diversification across various applications and geographical markets. The increasing focus on regulatory compliance and data privacy will play a crucial role in shaping market dynamics in the coming years.
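The growth figures quoted above can be checked against the standard compound-growth formula. The sketch below assumes eight compounding years (2025 to 2033); the function names are illustrative. Under that assumption, compounding $15 billion at 25% gives roughly $89 billion, while reaching about $70 billion corresponds to a rate nearer 21%, so the quoted figures may be rounded or may use a different compounding window.

```python
def project(value: float, rate: float, years: int) -> float:
    """Future value after compounding `rate` annually for `years` years."""
    return value * (1.0 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that turns `start` into `end` over `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


START_BN = 15.0        # market estimate for 2025, in billions of dollars
END_BN = 70.0          # approximate projection for 2033, in billions of dollars
YEARS = 2033 - 2025    # assumed number of compounding periods

if __name__ == "__main__":
    print(f"15bn at 25% for {YEARS} years: {project(START_BN, 0.25, YEARS):.1f}bn")
    print(f"Rate implied by 15bn -> 70bn: {implied_cagr(START_BN, END_BN, YEARS):.1%}")
```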
OpenWeb Ninja's Public Event Data API provides fast, reliable, and real-time access to any public event including sport events, concerts, workshops, festivals, movies, and more types supported by Google Events.
OpenWeb Ninja's Public Event Data API sources its data from Google Events, a global aggregate that collects events from any trusted public source on the web. The API provides comprehensive event data, featuring over 30 data points per event. This includes start and end times, ticket links, location and detailed venue information, and additional details.
OpenWeb Ninja's Public Event Data common use cases:
- Event Analytics and Insights
- Event Discovery Platforms
- Travel and Tourism Websites
- Smart City Applications
- Marketing and Promotions
OpenWeb Ninja's Public Event Data Stats & Capabilities:
- 30+ data points per event
- Global aggregate
- Extensive event venue details
- All public event types
- Capabilities such as online event and date filters
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The Synthea Generated Synthetic Data in FHIR hosts over 1 million synthetic patient records generated using Synthea in FHIR format. The records were exported from the Google Cloud Healthcare API FHIR Store into BigQuery using the analytics schema. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. This public dataset is also available in Google Cloud Storage and is free to use. The URL for the GCS bucket is gs://gcp-public-data--synthea-fhir-data-1m-patients. Please cite Synthea™ as: Jason Walonoski, Mark Kramer, Joseph Nichols, Andre Quina, Chris Moesel, Dylan Hall, Carlton Duffett, Kudakwashe Dube, Thomas Gallagher, Scott McLachlan, Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record, Journal of the American Medical Informatics Association, Volume 25, Issue 3, March 2018, Pages 230–238, https://doi.org/10.1093/jamia/ocx079
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
https://pro.europeana.eu/ Europeana.eu serves as an interface to over 40 million books, paintings, films, artifacts and archival material from all over Europe. Around 2,200 European institutions have contributed to Europeana, ranging from major international names such as the Rijksmuseum in Amsterdam, the British Library and the Louvre Museum, to regional archives and local museums.
The Mona Lisa by Leonardo da Vinci, the works of Charles Darwin (https://www.europeana.eu/portal/en/search?q=%22Charles+darwin%22) and Isaac Newton (https://www.europeana.eu/portal/en/search?q=%22Isaac+newton%22), and the sound of Wolfgang Amadeus Mozart are some of the highlights of Europeana.
Europeana has introduced a standardised system that uses Creative Commons licenses and Europeana rights declarations. This means that all metadata (content descriptions such as name, date, type, theme, etc.) on Europeana is offered under an open Creative Commons license, the CC0 1.0 Universal Public Domain Dedication. For digital objects, e.g. stamp images, contributors have decided how accessible they should be: an image may carry a completely open license or be subject to restricted access. The license information for a digital object can be found in its metadata field. https://www.europeana.eu/portal/da/rights/terms.html
Europeana has an open API that delivers data in JSON format. Here you can request an API key: https://pro.europeana.eu/resources/apis/intro See also https://www.europeana.eu/portal/da/rights/api.html.
The license (i.e. rights to reuse) is set at the record level and will therefore vary from record to record in the data set.
Europeana Forum for the API: https://groups.google.com/forum/?pli=1#!forum/europeanaapi
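As an illustration of the open JSON API mentioned above, the sketch below builds a search request URL without sending it. The endpoint and the `wskey`, `query`, and `rows` parameter names are assumptions based on the Europeana Search API and should be confirmed in the API documentation; `YOUR_API_KEY` is a placeholder for the key requested at the address given above.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Europeana Search API; verify against the API docs.
SEARCH_ENDPOINT = "https://api.europeana.eu/record/v2/search.json"


def build_search_url(api_key: str, query: str, rows: int = 12) -> str:
    """Return a search URL; parameter names are assumptions, see lead-in."""
    params = {"wskey": api_key, "query": query, "rows": rows}
    # urlencode percent-encodes quotes and turns spaces into '+'.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Phrase search, mirroring the quoted portal query for Charles Darwin.
    print(build_search_url("YOUR_API_KEY", '"Charles Darwin"'))
```

Because reuse rights vary between entries, any client consuming the JSON response would also need to read the rights information returned with each item.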
Our Price Paid Data includes information on all property sales in England and Wales that are sold for value and are lodged with us for registration.
Get up to date with the permitted use of our Price Paid Data:
check what to consider when using or publishing our Price Paid Data
If you use or publish our Price Paid Data, you must add the following attribution statement:
Contains HM Land Registry data © Crown copyright and database right 2021. This data is licensed under the Open Government Licence v3.0.
Price Paid Data is released under the Open Government Licence (OGL) (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/). You need to make sure you understand the terms of the OGL before using the data.
Under the OGL, HM Land Registry permits you to use the Price Paid Data for commercial or non-commercial purposes. However, OGL does not cover the use of third party rights, which we are not authorised to license.
Price Paid Data contains address data processed against Ordnance Survey’s AddressBase Premium product, which incorporates Royal Mail’s PAF® database (Address Data). Royal Mail and Ordnance Survey permit your use of Address Data in the Price Paid Data:
If you want to use the Address Data in any other way, you must contact Royal Mail. Email address.management@royalmail.com.
The following fields comprise the address data included in Price Paid Data:
The June 2025 release includes:
As we will be adding to the June data in future releases, we would not recommend using it in isolation as an indication of market or HM Land Registry activity. When the full dataset is viewed alongside the data we’ve previously published, it adds to the overall picture of market activity.
Your use of Price Paid Data is governed by conditions and by downloading the data you are agreeing to those conditions.
Google Chrome (Chrome 88 onwards) is blocking downloads of our Price Paid Data. Please use another internet browser while we resolve this issue. We apologise for any inconvenience caused.
We update the data on the 20th working day of each month. You can download the:
These include standard and additional price paid data transactions received at HM Land Registry from 1 January 1995 to the most current monthly data.
The data is updated monthly and the average size of this file is 3.7 GB. You can download:
Field Name: Description
StateName: Name of the state (Oklahoma)
date: Date of the data point (YYYY-MM-DD)
covid-19_OK: The search interest in the term "COVID-19" in Oklahoma on the given date
sars-cov-2_OK: The search interest in the term "SARS-CoV-2" in Oklahoma on the given date
coronavirus_OK: The search interest in the term "coronavirus" in Oklahoma on the given date
Omicron_OK: The search interest in the term "Omicron" in Oklahoma on the given date
Delta_OK: The search interest in the term "Delta" in Oklahoma on the given date
Fever_OK: The search interest in the term "fever" in Oklahoma on the given date
fatigue_OK: The search interest in the term "fatigue" in Oklahoma on the given date
diarrhea_OK: The search interest in the term "diarrhea" in Oklahoma on the given date
pneumonia_OK: The search interest in the term "pneumonia" in Oklahoma on the given date
sore throat_OK: The search interest in the term "sore throat" in Oklahoma on the given date
loss of smell_OK: The search interest in the term "loss of smell" in Oklahoma on the given date
loss smell_OK: Another variation for tracking the search interest in "loss of smell" in Oklahoma on the given date
loss taste_OK: The search interest in the term "loss of taste" in Oklahoma on the given date
cough_OK: The search interest in the term "cough" in Oklahoma on the given date
nasal congestion_OK: The search interest in the term "nasal congestion" in Oklahoma on the given date
Pytrends is an unofficial Google Trends API for Python. It enables users to programmatically fetch Google Trends data, which can be useful for various applications such as market research, academic studies, and tracking public interest in specific topics over time.
Benefits of Using Pytrends:
Automated Data Collection: Pytrends allows for automated and repeatable data collection from Google Trends, saving time and effort compared to manual extraction.
Customizable Queries: Users can specify keywords, timeframes, geographic locations, and other parameters to tailor the data to their specific needs.
Integration with Data Analysis Tools: Pytrends data can be easily integrated with tools like pandas for further analysis, visualization, and reporting.
Real-Time Insights: By regularly updating and analyzing Google Trends data, users can gain real-time insights into public interest and behavior, which is valuable for decision-making and research.
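A collection run of the kind described above might look like the following sketch. It is an illustration, not the method used to build this dataset: the term list is a subset of the fields listed earlier, the `geo` code `US-OK` and the timeframe are assumptions, and `fetch_interest` needs the third-party `pytrends` package and network access, so only the two pure helpers run offline. Terms are sent in batches of five because Google Trends compares at most five terms per request.

```python
from typing import Dict, Iterator, List

MAX_TERMS_PER_REQUEST = 5  # Google Trends compares at most five terms at once

TERMS = ["COVID-19", "SARS-CoV-2", "coronavirus", "Omicron", "Delta", "fever", "fatigue"]


def batches(terms: List[str], size: int = MAX_TERMS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `terms` holding at most `size` items."""
    for start in range(0, len(terms), size):
        yield terms[start:start + size]


def column_name(term: str, state_code: str = "OK") -> str:
    """Name a column as <term>_<state code>, matching the field list above."""
    return f"{term}_{state_code}"


def fetch_interest(terms: List[str], geo: str = "US-OK",
                   timeframe: str = "today 5-y") -> Dict[str, object]:
    """Fetch one series per term. Requires pytrends and network access."""
    from pytrends.request import TrendReq  # imported lazily; third-party package

    client = TrendReq(hl="en-US", tz=360)
    series: Dict[str, object] = {}
    for batch in batches(terms):
        client.build_payload(kw_list=batch, timeframe=timeframe, geo=geo)
        frame = client.interest_over_time()
        for term in batch:
            if term in frame.columns:  # the frame can be empty for rare terms
                series[column_name(term)] = frame[term]
    return series


if __name__ == "__main__":
    for batch in batches(TERMS):
        print([column_name(term) for term in batch])
```

Google Trends scales each request relative to the peak within that request, so values from different batches are not directly comparable without a shared anchor term.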
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please cite this paper when using this dataset: N. Thakur, “Mpox narrative on Instagram: A labeled multilingual dataset of Instagram posts on mpox for sentiment, hate speech, and anxiety analysis,” arXiv [cs.LG], 2024, URL: https://arxiv.org/abs/2409.05292
Abstract: The world is currently experiencing an outbreak of mpox, which has been declared a Public Health Emergency of International Concern by WHO. During recent virus outbreaks, social media platforms have played a crucial role in keeping the global population informed and updated regarding various aspects of the outbreaks. As a result, in the last few years, researchers from different disciplines have focused on the development of social media datasets focusing on different virus outbreaks. No prior work in this field has focused on the development of a dataset of Instagram posts about the mpox outbreak. The work presented in this paper (stated above) aims to address this research gap. It presents this multilingual dataset of 60,127 Instagram posts about mpox, published between July 23, 2022, and September 5, 2024. This dataset contains Instagram posts about mpox in 52 languages. For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset. After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were also performed. This process included classifying each post into:
one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral
hate or not hate
anxiety/stress detected or no anxiety/stress detected.
These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for sentiment, hate speech, and anxiety or stress detection, as well as for other applications.
The 52 distinct languages in which Instagram posts are present in the dataset are English, Portuguese, Indonesian, Spanish, Korean, French, Hindi, Finnish, Turkish, Italian, German, Tamil, Urdu, Thai, Arabic, Persian, Tagalog, Dutch, Catalan, Bengali, Marathi, Malayalam, Swahili, Afrikaans, Panjabi, Gujarati, Somali, Lithuanian, Norwegian, Estonian, Swedish, Telugu, Russian, Danish, Slovak, Japanese, Kannada, Polish, Vietnamese, Hebrew, Romanian, Nepali, Czech, Modern Greek, Albanian, Croatian, Slovenian, Bulgarian, Ukrainian, Welsh, Hungarian, and Latvian.
The following is a description of the attributes present in this dataset:
Post ID: Unique ID of each Instagram post
Post Description: Complete description of each post in the language in which it was originally published
Date: Date of publication in MM/DD/YYYY format
Language: Language of the post as detected using the Google Translate API
Translated Post Description: Translated version of the post description. All posts which were not in English were translated into English using the Google Translate API. No language translation was performed for English posts.
Sentiment: Results of sentiment analysis (using the preprocessed version of the translated Post Description) where each post was classified into one of the sentiment classes: fear, surprise, joy, sadness, anger, disgust, and neutral
Hate: Results of hate speech detection (using the preprocessed version of the translated Post Description) where each post was classified as hate or not hate
Anxiety or Stress: Results of anxiety or stress detection (using the preprocessed version of the translated Post Description) where each post was classified as stress/anxiety detected or no stress/anxiety detected.
All the Instagram posts that were collected during this data mining process to develop this dataset were publicly available on Instagram and did not require a user to log in to Instagram to view the same (at the time of writing this paper).
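The attribute list for the mpox dataset can be turned into a small typed record for loading. The following is a minimal Python sketch, not the author's tooling. It assumes the file's column headers match the attribute names listed in the description and that sentiment labels are stored as the seven class names in lowercase or mixed case; both are assumptions that should be checked against the actual file. The sample row is invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime

# The seven sentiment classes named in the dataset description.
SENTIMENTS = {"fear", "surprise", "joy", "sadness", "anger", "disgust", "neutral"}


@dataclass
class MpoxPost:
    post_id: str
    description: str
    published: date
    language: str
    translated_description: str
    sentiment: str
    hate: str
    anxiety_or_stress: str


def parse_row(row: dict) -> MpoxPost:
    """Convert one row (keyed by assumed column headers) into an MpoxPost."""
    sentiment = row["Sentiment"].strip().lower()
    if sentiment not in SENTIMENTS:
        raise ValueError(f"unexpected sentiment label: {row['Sentiment']!r}")
    return MpoxPost(
        post_id=row["Post ID"],
        description=row["Post Description"],
        # The description states that dates use the MM/DD/YYYY format.
        published=datetime.strptime(row["Date"].strip(), "%m/%d/%Y").date(),
        language=row["Language"],
        translated_description=row["Translated Post Description"],
        sentiment=sentiment,
        # Hate and anxiety labels are kept verbatim; their exact spelling
        # in the file is not specified in the description.
        hate=row["Hate"].strip(),
        anxiety_or_stress=row["Anxiety or Stress"].strip(),
    )


# Hypothetical row, invented for illustration only.
sample = {
    "Post ID": "example-id",
    "Post Description": "Example text",
    "Date": "07/23/2022",
    "Language": "English",
    "Translated Post Description": "Example text",
    "Sentiment": "Fear",
    "Hate": "not hate",
    "Anxiety or Stress": "no stress/anxiety detected",
}
post = parse_row(sample)
```

Rows read with `csv.DictReader` could be passed to `parse_row` one at a time; rejecting unknown sentiment labels early keeps mislabeled rows out of training and test splits.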
The main objective of the "Accessible Vienna" application is to support citizens and visitors with special needs in the city of Vienna. Their daily activities depend not only on the city's infrastructure but also on information about various places (e.g. restaurants, cafés, theaters) and public facilities (e.g. parking spaces, subway stations). To this end, "Accessible Vienna" combines the Open Government Data of Vienna with Google Places data. It gathers information from the municipal data about parking spaces designed for people with disabilities, subway stations with elevators, and accessible restaurants, cafés and theaters. This data is linked with details about the accessible places retrieved from the Google Places API (i.e. photos, ratings, website, Google+ and opening hours). Users can thus choose an accessible place like any other citizen, by checking photos, ratings and other venue-related information, and can also find out whether the public services they need to reach the destination (parking spaces, accessible subway stations) are available.
Company Datasets for valuable business insights!
Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.
These datasets are sourced from top industry providers, ensuring you have access to high-quality information:
We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:
You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.
Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.
With Oxylabs Datasets, you can count on:
Pricing Options:
Standard Datasets: Choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.
Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.
Experience a seamless journey with Oxylabs:
Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!
OpenWeb Ninja's Google Images Data (Google SERP Data) API provides real-time image search capabilities for images sourced from all public sources on the web.
The API enables you to search and access more than 100 billion images from across the web including advanced filtering capabilities as supported by Google Advanced Image Search. The API provides Google Images Data (Google SERP Data) including details such as image URL, title, size information, thumbnail, source information, and more data points. The API supports advanced filtering and options such as file type, image color, usage rights, creation time, and more. In addition, any Advanced Google Search operators can be used with the API.
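Since search operators can be embedded in the query itself, a query string can be composed before it is sent. The following is a minimal Python sketch under stated assumptions: `build_query` is a hypothetical helper, and the `query` parameter name is an assumption rather than the documented API parameter. The `site:`, `filetype:` and `-term` operators are standard Google search operators. No request is made, and no endpoint is assumed.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, filetype=None, exclude=()):
    """Compose a search string using common Google search operators."""
    parts = [terms.strip()]
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict by file extension
    parts.extend(f"-{word}" for word in exclude)  # exclude specific terms
    return " ".join(parts)


q = build_query("red sneakers", site="example.com", filetype="png", exclude=["kids"])

# "query" is a hypothetical parameter name; consult the API documentation
# for the real parameter names and for dedicated filter parameters.
encoded = urlencode({"query": q})
```

Filters the API exposes directly (file type, color, usage rights, creation time) would normally go in their own parameters; embedding operators in the query is mainly useful for constraints that have no dedicated filter.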
OpenWeb Ninja's Google Images Data & Google SERP Data API common use cases:
Creative Media Production: Enhance digital content with a vast array of real-time images, ensuring engaging and brand-aligned visuals for blogs, social media, and advertising.
AI Model Enhancement: Train and refine AI models with diverse, annotated images, improving object recognition and image classification accuracy.
Trend Analysis: Identify emerging market trends and consumer preferences through real-time visual data, enabling proactive business decisions.
Innovative Product Design: Inspire product innovation by exploring current design trends and competitor products, ensuring market-relevant offerings.
Advanced Search Optimization: Improve search engines and applications with enriched image datasets, providing users with accurate, relevant, and visually appealing search results.
OpenWeb Ninja's Annotated Imagery Data & Google SERP Data Stats & Capabilities:
100B+ Images: Access an extensive database of over 100 billion images.
Images Data from all Public Sources (Google SERP Data): Benefit from a comprehensive aggregation of image data from various public websites, ensuring a wide range of sources and perspectives.
Extensive Search and Filtering Capabilities: Utilize advanced search operators and filters to refine image searches by file type, color, usage rights, creation time, and more, making it easy to find exactly what you need.
Rich Data Points: Each image comes with more than 10 data points, including URL, title (annotation), size information, thumbnail, and source information, providing a detailed context for each image.