29 datasets found

yelp_review_full
huggingface.co
Updated Mar 6, 2012
Cite
Yelp (2012). yelp_review_full [Dataset]. https://huggingface.co/datasets/Yelp/yelp_review_full
Dataset updated
Mar 6, 2012
Dataset authored and provided by
Yelp (http://yelp.com/)
License
https://choosealicense.com/licenses/other/
Description
Dataset Card for YelpReviewFull

Dataset Summary

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data.

Supported Tasks and Leaderboards

text-classification, sentiment-classification: The dataset is mainly used for text classification: given the text, predict the sentiment.

Languages

The reviews were mainly written in English.

Dataset Structure

Data Instances

A… See the full description on the dataset page: https://huggingface.co/datasets/Yelp/yelp_review_full.
Yelp dataset
kaggle.com
Updated Feb 14, 2020
Cite
MyrnaMFL (2020). Yelp dataset [Dataset]. https://www.kaggle.com/datasets/fireballbyedimyrnmom/yelp-dataset
Dataset updated
Feb 14, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
MyrnaMFL
License
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Description
Dataset

This dataset was created by PrivacyMatters

Released under Database: Open Database, Contents: © Original Authors

Contents
Yelp Open Dataset
live.european-language-grid.eu
json
Updated Dec 30, 2015
Cite
Yelp (2015). Yelp Open Dataset [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/5179
Available download formats: json
Dataset updated
Dec 30, 2015
Dataset authored and provided by
Yelp (http://yelp.com/)
License
https://s3-media0.fl.yelpcdn.com/assets/srv0/engineering_pages/bea5c1e92bf3/assets/vendor/yelp-dataset-agreement.pdf
Description
Dataset containing millions of reviews on Yelp. In addition it contains business data including location data, attributes, and categories.
Yelp Dataset
kaggle.com
zip
Updated Mar 17, 2022
Cite
Yelp, Inc. (2022). Yelp Dataset [Dataset]. https://www.kaggle.com/yelp-dataset/yelp-dataset
Available download formats: zip (4374983563 bytes)
Dataset updated
Mar 17, 2022
Dataset provided by
Yelp (http://yelp.com/)
Authors
Yelp, Inc.
Description
Context

This dataset is a subset of Yelp's businesses, reviews, and user data. It was originally put together for the Yelp Dataset Challenge which is a chance for students to conduct research or analysis on Yelp's data and share their discoveries. In the most recent dataset you'll find information about businesses across 8 metropolitan areas in the USA and Canada.

Content

This dataset contains five JSON files and the user agreement. More information about those files can be found here.

Code snippet to read the files

In Python, you can read the JSON files like this (using the json and pandas libraries):

    import json
    import pandas as pd

    # Each line of the file holds one JSON object, so parse it line by line.
    data = []
    with open("yelp_academic_dataset_checkin.json", encoding="utf-8") as data_file:
        for line in data_file:
            data.append(json.loads(line))

    # Build a DataFrame with one row per parsed record.
    checkin_df = pd.DataFrame(data)
yelp_polarity_reviews
tensorflow.org
Cite
(2022). yelp_polarity_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/yelp_polarity_reviews
Description
Large Yelp Review Dataset. This is a dataset for binary sentiment classification. We provide a set of 560,000 highly polar Yelp reviews for training, and 38,000 for testing.

ORIGIN

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data. For more information, please refer to http://www.yelp.com/dataset

The Yelp reviews polarity dataset was constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It was first used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).

DESCRIPTION

The Yelp reviews polarity dataset is constructed by considering stars 1 and 2 negative, and 3 and 4 positive. For each polarity, 280,000 training samples and 19,000 testing samples are taken randomly. In total there are 560,000 training samples and 38,000 testing samples. Negative polarity is class 1, and positive polarity is class 2.

The files train.csv and test.csv contain the training and testing samples as comma-separated values. There are 2 columns in them, corresponding to class index (1 and 2) and review text. The review texts are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
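The escaping rules above can be handled with Python's standard csv module. What follows is a minimal sketch, assuming the two-column layout described; the helper name parse_polarity_rows and the sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io


def parse_polarity_rows(handle):
    """Yield (class_index, review_text) pairs from a train.csv-style stream."""
    # csv.reader handles the quoting: fields are wrapped in double quotes,
    # and an internal double quote is written as two double quotes.
    for class_index, text in csv.reader(handle):
        # Newlines are stored as a literal backslash followed by "n",
        # so turn that two-character sequence back into a real newline.
        yield int(class_index), text.replace("\\n", "\n")


# Hypothetical sample in the described format (not real dataset rows).
sample = io.StringIO(
    '"1","The waiter said ""sorry"" twice.\\nStill a bad visit."\n'
    '"2","Great food."\n'
)
rows = list(parse_polarity_rows(sample))
print(rows)
```

For a real file, the same function could be given a handle opened with open("train.csv", newline="", encoding="utf-8"); the encoding is an assumption.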

To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('yelp_polarity_reviews', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.
data extracted from Yelp Open Dataset
figshare.com
txt
Updated Jun 16, 2023
Cite
Sven Gedicke (2023). data extracted from Yelp Open Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.20318538.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.20318538.v1
Dataset updated
Jun 16, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Sven Gedicke
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The different data sets of point features related to food that we used for the different tasks in our user study. The data was extracted from the Yelp Open Dataset.
yelp-open-dataset-checkin
huggingface.co
Cite
Yash Raizada, yelp-open-dataset-checkin [Dataset]. https://huggingface.co/datasets/yashraizada/yelp-open-dataset-checkin
Authors
Yash Raizada
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
yashraizada/yelp-open-dataset-checkin dataset hosted on Hugging Face and contributed by the HF Datasets community
The Yelp Collaborative Knowledge Graph
data.niaid.nih.gov
zenodo.org
Updated Jun 17, 2023
Cite
Heede, Thomas (2023). The Yelp Collaborative Knowledge Graph [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7878446
Dataset updated
Jun 17, 2023
Dataset provided by
Corfixen, Mads
Olesen, Magnus
Nielsen, Christian Filip Pinderup
Heede, Thomas
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This is the Yelp Collaborative Knowledge Graph (YCKG), a transformation of the Yelp Open Dataset into RDF format using Y2KG.

Paper Abstract

The Yelp Open Dataset (YOD) contains data about businesses, reviews, and users from the Yelp website and is available for research purposes. This dataset has been widely used to develop and test Recommender Systems (RS), especially those using Knowledge Graphs (KGs), e.g., integrating taxonomies, product categories, business locations, and social network information. Unfortunately, researchers applied naive or wrong mappings while converting YOD in KGs, consequently obtaining unrealistic results. Among the various issues, the conversion processes usually do not follow state-of-the-art methodologies, fail to properly link to other KGs and reuse existing vocabularies. In this work, we overcome these issues by introducing Y2KG, a utility to convert the Yelp dataset into a KG. Y2KG consists of two components. The first is a dataset including (1) a vocabulary that extends Schema.org with properties to describe the concepts in YOD and (2) mappings between the Yelp entities and Wikidata. The second component is a set of scripts to transform YOD in RDF and obtain the Yelp Collaborative Knowledge Graph (YCKG). The design of Y2KG was driven by 16 core competency questions. YCKG includes 150k businesses and 16.9M reviews from 1.9M distinct real users, resulting in over 244 million triples (with 144 distinct predicates) for about 72 million resources, with an average in-degree and out-degree of 3.3 and 12.2, respectively.

Links

Latest GitHub release: https://github.com/MadsCorfixen/The-Yelp-Collaborative-Knowledge-Graph/releases/latest

PURL domain: https://purl.archive.org/domain/yckg

Files

Graph Data Triple Files

One sample file for each of the Yelp domains (Businesses, Users, Reviews, Tips and Checkins), each containing 20 entities.

yelp_schema_mappings.nt.gz containing the mappings from Yelp categories to Schema things.

schema_hierarchy.nt.gz containing the full hierarchy of the mapped Schema things.

yelp_wiki_mappings.nt.gz containing the mappings from Yelp categories to Wikidata entities.

wikidata_location_mappings.nt.gz containing the mappings from Yelp locations to Wikidata entities.
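The gzip-compressed N-Triples files listed above can be read line by line with the Python standard library. This is a minimal sketch only: the helper name iter_triples, the example.org IRIs, and the predicate are hypothetical, and the whitespace split is a simplification that a full RDF parser would replace.

```python
import gzip
import io


def iter_triples(handle):
    """Yield (subject, predicate, object) strings from N-Triples text lines."""
    for line in handle:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Drop the terminating " ." of the statement.
        if line.endswith("."):
            line = line[:-1].rstrip()
        # Simplification: subject and predicate contain no whitespace,
        # so everything after the second split belongs to the object.
        subject, predicate, obj = line.split(None, 2)
        yield subject, predicate, obj


# Hypothetical in-memory stand-in for a file such as yelp_schema_mappings.nt.gz.
raw = (
    "# made-up mapping for illustration\n"
    "<https://example.org/category/Bakeries> "
    "<https://example.org/mapsTo> "
    "<https://schema.org/Bakery> .\n"
)
buffer = io.BytesIO(gzip.compress(raw.encode("utf-8")))

with gzip.open(buffer, mode="rt", encoding="utf-8") as handle:
    triples = list(iter_triples(handle))
print(triples)
```

With a downloaded file, gzip.open can be given the file path instead of the in-memory buffer.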

Graph Metadata Triple Files

yelp_categories.ttl contains metadata for all Yelp categories.

yelp_entities.ttl contains metadata regarding the dataset

yelp_vocabulary.ttl contains metadata on the created Yelp vocabulary and properties.

Utility Files

yelp_category_schema_mappings.csv. This file contains the 310 mappings from Yelp categories to Schema types. These mappings have been manually verified to be correct.

yelp_predicate_schema_mappings.csv. This file contains the 14 mappings from Yelp attributes to Schema properties. These mappings are manually found.

ground_truth_yelp_category_schema_mappings.csv. This file contains the ground truth, based on 200 manually verified mappings from Yelp categories to Schema things. The ground truth mappings were used to calculate precision and recall for the semantic mappings.

manually_split_categories.csv. This file contains all Yelp categories containing either a & or /, and their manually split versions. The split versions have been used in the semantic mappings to Schema things.
SYNERGY - Open machine learning dataset on study selection in systematic...
dataverse.nl
csv, json, txt, zip
Updated Apr 24, 2023
Cite
Jonathan De Bruin; Yongchao Ma; Gerbrich Ferdinands; Jelle Teijema; Rens Van de Schoot (2023). SYNERGY - Open machine learning dataset on study selection in systematic reviews [Dataset]. http://doi.org/10.34894/HE6NAQ
Available download formats: csv, json, txt, zip (multiple files of each type)
Unique identifier
https://doi.org/10.34894/HE6NAQ
Dataset updated
Apr 24, 2023
Dataset provided by
DataverseNL
Authors
Jonathan De Bruin; Yongchao Ma; Gerbrich Ferdinands; Jelle Teijema; Rens Van de Schoot
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
SYNERGY is a free and open dataset on study selection in systematic reviews, comprising 169,288 academic works from 26 systematic reviews. Only 2,834 (1.67%) of the academic works in the binary classified dataset are included in the systematic reviews. This makes the SYNERGY dataset a unique dataset for the development of information retrieval algorithms, especially for sparse labels. Due to the many variables available per record (i.e. titles, abstracts, authors, references, topics), this dataset is useful for researchers in NLP, machine learning, network analysis, and more. In total, the dataset contains 82,668,134 trainable data points. The easiest way to get the SYNERGY dataset is via the synergy-dataset Python package. See https://github.com/asreview/synergy-dataset for all information.
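The label sparsity quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two counts given in the description; the variable names are illustrative.

```python
# Counts quoted in the SYNERGY description.
total_works = 169_288
included_works = 2_834

# Share of works that were included in the systematic reviews.
inclusion_rate = included_works / total_works

# Number of works that were screened but not included.
excluded_works = total_works - included_works

print(f"Inclusion rate: {inclusion_rate:.2%}")
print(f"Excluded works per included work: {excluded_works / included_works:.1f}")
```

The inclusion rate rounds to the 1.67% stated in the description, which illustrates how heavily the two classes are imbalanced.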
yelp-open-dataset-top-businesses
huggingface.co
Updated Jan 25, 2024
Cite
Yash Raizada (2024). yelp-open-dataset-top-businesses [Dataset]. https://huggingface.co/datasets/yashraizad/yelp-open-dataset-top-businesses
Dataset updated
Jan 25, 2024
Authors
Yash Raizada
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
yashraizad/yelp-open-dataset-top-businesses dataset hosted on Hugging Face and contributed by the HF Datasets community
Goodreads Book Reviews
cseweb.ucsd.edu
json
Cite
UCSD CSE Research Project, Goodreads Book Reviews [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user interaction, ranging from adding to a shelf to rating and reading.

Metadata includes

reviews

add-to-shelf, read, review actions

book attributes: title, isbn

graph of similar books

Basic Statistics:

Items: 1,561,465

Users: 808,749

Interactions: 225,394,930
Amazon review data 2018
nijianmo.github.io
cseweb.ucsd.edu
Cite
UCSD CSE Research Project, Amazon review data 2018 [Dataset]. https://nijianmo.github.io/amazon/
Dataset authored and provided by
UCSD CSE Research Project
Description
Context

This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

More reviews:

The total number of reviews is 233.1 million (142.8 million in 2014).

New reviews:

Current data includes reviews in the range May 1996 - Oct 2018.

Metadata:

We have added transaction metadata for each review shown on the review page.

Added more detailed metadata of the product landing page.

Acknowledgements

If you publish articles based on this dataset, please cite the following paper:

Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. EMNLP, 2019.
Pinterest Fashion Compatibility
cseweb.ucsd.edu
beta.data.urbandatacentre.ca
json
Cite
UCSD CSE Research Project, Pinterest Fashion Compatibility [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.

Metadata includes

product IDs

bounding boxes

Basic Statistics:

Scenes: 47,739

Products: 38,111

Scene-Product Pairs: 93,274
Food Inspection - LIVES standard
data.montgomerycountymd.gov
s.cnmilf.com
application/rdfxml +5
Updated Mar 1, 2017
Cite
(2017). Food Inspection - LIVES standard [Dataset]. https://data.montgomerycountymd.gov/Health-and-Human-Services/Food-Inspection-LIVES-standard/ft84-r7wr
Available download formats: csv, application/rdfxml, application/rssxml, xml, tsv, json
Dataset updated
Mar 1, 2017
Description
Current food inspection dataset published using the LIVES data standard.
Food Inspection Violations in Boston, MA
dataverse.harvard.edu
Updated Aug 27, 2020
Cite
Bidisha Das; Alina Ristea; Daniel T. O'Brien (2020). Food Inspection Violations in Boston, MA [Dataset]. http://doi.org/10.7910/DVN/6MUQKX
Unique identifier
https://doi.org/10.7910/DVN/6MUQKX
Dataset updated
Aug 27, 2020
Dataset provided by
Harvard Dataverse
Authors
Bidisha Das; Alina Ristea; Daniel T. O'Brien
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Massachusetts, Boston
Description
These datasets include Food Establishment Inspections processed from the city's open data initiative (data.boston.gov), linked with Yelp review data scraped by BARI. All data falls within the city of Boston. The Food Inspections dataset is released by the Health Division of the Department of Inspectional Services of Boston, which ensures that all food establishments in the City of Boston meet relevant sanitary codes and standards. The data scraped from Yelp pages covers restaurants in Boston that were reviewed on yelp.com. The data comprises two files:

Food.Inspections.Records.csv contains information about food inspections at the record level (i.e. each record for each restaurant is included).

Food.Inspections.Yelp.Restaurant.csv contains information about food inspections at the restaurant level, plus information from Yelp reviews, also at the restaurant level.
yelp-open-dataset-top-users
huggingface.co
Cite
Yash Raizada, yelp-open-dataset-top-users [Dataset]. https://huggingface.co/datasets/yashraizad/yelp-open-dataset-top-users
Authors
Yash Raizada
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
yashraizad/yelp-open-dataset-top-users dataset hosted on Hugging Face and contributed by the HF Datasets community
ScrapeHero Data Cloud - Free and Easy to use
datarade.ai
.json, .csv
Updated Apr 11, 2022
Cite
Scrapehero (2022). ScrapeHero Data Cloud - Free and Easy to use [Dataset]. https://datarade.ai/data-products/scrapehero-data-cloud-free-and-easy-to-use-scrapehero
Available download formats: .json, .csv
Dataset updated
Apr 11, 2022
Dataset provided by
ScrapeHero
Authors
Scrapehero
Area covered
Bhutan, Dominica, Bahamas, Slovakia, Anguilla, Ghana, Portugal, Niue, Chad, Bahrain
Description
The Easiest Way to Collect Data from the Internet

Download anything you see on the internet into spreadsheets within a few clicks using our ready-made web crawlers, or with a few lines of code using our APIs.

We have made it as simple as possible to collect data from websites.

Easy to Use Crawlers

Amazon Product Details and Pricing Scraper: Get product information, pricing, FBA, best seller rank, and much more from Amazon.

Google Maps Search Results: Get details like place name, phone number, address, website, ratings, and open hours from Google Maps or Google Places search results.

Twitter Scraper: Get tweets, Twitter handle, content, number of replies, number of retweets, and more. All you need to provide is a URL to a profile, hashtag, or an advanced search URL from Twitter.

Amazon Product Reviews and Ratings: Get customer reviews for any product on Amazon and get details like product name, brand, reviews and ratings, and more from Amazon.

Google Reviews Scraper: Scrape Google reviews and get details like business or location name, address, review, ratings, and more for businesses and places.

Walmart Product Details & Pricing: Get the product name, pricing, number of ratings, reviews, product images, URL, and other product-related data from Walmart.

Amazon Search Results Scraper: Get product search rank, pricing, availability, best seller rank, and much more from Amazon.

Amazon Best Sellers: Get the bestseller rank, product name, pricing, number of ratings, rating, product images, and more from any Amazon Bestseller List.

Google Search Scraper: Scrape Google search results and get details like search rank, paid and organic results, knowledge graph, related search results, and more.

Walmart Product Reviews & Ratings: Get customer reviews for any product on Walmart.com and get details like product name, brand, reviews, and ratings.

Scrape Emails and Contact Details: Get emails, addresses, contact numbers, and social media links from any website.

Walmart Search Results Scraper: Get product details such as pricing, availability, reviews, ratings, and more from Walmart search results and categories.

Glassdoor Job Listings: Scrape job details such as job title, salary, job description, location, company name, number of reviews, and ratings from Glassdoor.

Indeed Job Listings: Scrape job details such as job title, salary, job description, location, company name, number of reviews, and ratings from Indeed.

LinkedIn Jobs Scraper (Premium): Scrape job listings on LinkedIn and extract job details such as job title, job description, location, company name, number of reviews, and more.

Redfin Scraper (Premium): Scrape real estate listings from Redfin. Extract property details such as address, price, mortgage, Redfin estimate, broker name, and more.

Yelp Business Details Scraper: Scrape business details from Yelp such as phone number, address, website, and more from Yelp search and business details pages.

Zillow Scraper (Premium): Scrape real estate listings from Zillow. Extract property details such as address, price, broker, broker name, and more.

Amazon product offers and third party sellers: Get product pricing, delivery details, FBA, seller details, and much more from the Amazon offer listing page.

Realtor Scraper (Premium): Scrape real estate listings from Realtor.com. Extract property details such as address, price, area, broker, and more.

Target Product Details & Pricing: Get product details from search results and category pages such as pricing, availability, rating, reviews, and 20+ data points from Target.

Trulia Scraper (Premium): Scrape real estate listings from Trulia. Extract property details such as address, price, area, mortgage, and more.

Amazon Customer FAQs: Get FAQs for any product on Amazon and get details like the question, answer, answered user name, and more.

Yellow Pages Scraper: Get details like business name, phone number, address, website, ratings, and more from Yellow Pages search results.
PDMX
cseweb.ucsd.edu
json
Cite
UCSD CSE Research Project, PDMX [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
We introduce PDMX: a Public Domain MusicXML dataset for symbolic music processing, including over 250k musical scores in MusicXML format. PDMX is the largest publicly available, copyright-free MusicXML dataset in existence. PDMX includes genre, tag, description, and popularity metadata for every file.
Food inspection fails 2015-2016
data.montgomerycountymd.gov
data.wu.ac.at
Updated Mar 1, 2017
Cite
(2017). Food inspection fails 2015-2016 [Dataset]. https://data.montgomerycountymd.gov/w/vztm-qm9u/tdqt-sri3?cur=-2i5id2emQs&from=lIGex4_mSCs
Available download formats: xml, csv, application/rdfxml, application/geo+json, tsv, kml, application/rssxml, kmz
Dataset updated
Mar 1, 2017
Description
Current food inspection dataset published using the LIVES data standard.
Geotagged Digital Traces
zenodo.org
ekoizpen-zientifikoa.ehu.eus
application/gzip
Updated May 19, 2023
Cite
Ricardo Munoz-Cancino; Sebastián A. Ríos; Manuel Graña (2023). Geotagged Digital Traces [Dataset]. http://doi.org/10.5281/zenodo.7949307
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.7949307
Dataset updated
May 19, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ricardo Munoz-Cancino; Sebastián A. Ríos; Manuel Graña
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset, divided into files by city, contains geotagged digital traces collected from different social media platforms, detailed below.

• Tweets - Cheng et al. [1]

• Gowalla [2]

• Tweets - Lamsal [3]

• YELP [4]

• Tweets - Kejriwal et al. [5]

• Geotagged Tweets [6]

• UrbanActivity [7]

• Brightkite [8]

• Weeplaces [8]

• Flickr [9]

• Foursquare [10]

Each file is named according to the city to which the digital traces were associated and contains the columns:

Source: contains the name of the source platform

Event_date: contains the date associated with the digital trace

Lat: latitude of the digital trace

Lng: longitude of the digital trace
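The column layout above can be read with Python's csv module. This is a minimal sketch that assumes each city file is comma-separated with a header row; the helper name count_traces_by_source, the sample rows, and the date format are hypothetical and not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_traces_by_source(handle):
    """Count digital traces per source platform in one city file."""
    counts = Counter()
    for row in csv.DictReader(handle):
        # Converting the coordinates raises ValueError on malformed rows.
        float(row["Lat"])
        float(row["Lng"])
        counts[row["Source"]] += 1
    return counts


# Hypothetical rows using the documented column names.
sample = io.StringIO(
    "Source,Event_date,Lat,Lng\n"
    "YELP,2019-05-01,40.7128,-74.0060\n"
    "Gowalla,2010-08-12,40.7306,-73.9866\n"
    "YELP,2020-01-15,40.7580,-73.9855\n"
)
counts = count_traces_by_source(sample)
print(counts)
```

Because the files are distributed as application/gzip, a real file would likely need to be decompressed first, for example with gzip.open in text mode.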

The definition of city/town used is provided by Simplemaps [11], which considers a city/town any inhabited place as determined by U.S. government agencies. The location of cities and their respective centers were obtained from the World Cities Database provided by the same company.

A specific group of these cities was utilized for the research presented in the article submitted to Sensors Journal:

Muñoz-Cancino, R., Rios, S. A., & Graña, M. (2023). Clustering cities over features extracted from multiple virtual sensors measuring micro-level activity patterns allows to discriminate large-scale city characteristics. Sensors, Under Review.

Comprehensive guidelines and the selection criteria can be found in the abovementioned article.

References

[1] Zhiyuan Cheng, James Caverlee, and Kyumin Lee. You are where you tweet: A content-based approach to geo-locating twitter users. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM '10, pages 759–768, New York, NY, USA, 2010. Association for Computing Machinery.
[2] Eunjoon Cho, Seth A. Myers, and Jure Leskovec. Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '11, pages 1082–1090, New York, NY, USA, 2011. Association for Computing Machinery.
[3] Yunhe Feng and Wenjun Zhou. Is working from home the new norm? An observational study based on a large geo-tagged covid-19 twitter dataset, 2020.
[4] Yelp Inc. Yelp Open Dataset, 2021. Retrieved from https://www.yelp.com/dataset. Accessed October 26, 2021.
[5] Mayank Kejriwal and Sara Melotte. A Geo-Tagged COVID-19 Twitter Dataset for 10 North American Metropolitan Areas, January 2021.
[6] Rabindra Lamsal. Design and analysis of a large-scale covid-19 tweets dataset. Applied Intelligence, 51(5):2790–2804, 2021.
[7] Geraud Le Falher, Aristides Gionis, and Michael Mathioudakis. Where is the Soho of Rome? Measures and algorithms for finding similar neighborhoods in cities. In 9th AAAI Conference on Web and Social Media - ICWSM 2015, Oxford, United Kingdom, May 2015.
[8] Yong Liu, Wei Wei, Aixin Sun, and Chunyan Miao. Exploiting geographical neighborhood characteristics for location recommendation. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM '14, pages 739–748, New York, NY, USA, 2014. Association for Computing Machinery.
[9] Hatem Mousselly-Sergieh, Daniel Watzinger, Bastian Huber, Mario Doller, Elood Egyed-Zsigmond, and Harald Kosch. World-wide scale geotagged image dataset for automatic image annotation and reverse geotagging. In Proceedings of the 5th ACM Multimedia Systems Conference, MMSys '14, pages 47–52, New York, NY, USA, 2014. Association for Computing Machinery.
[10] Dingqi Yang, Daqing Zhang, Vincent W. Zheng, and Zhiyong Yu. Modeling user activity preference by leveraging user spatial temporal characteristics in lbsns. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 45(1):129–142, 2015.
[11] Simple Maps. Basic World Cities Database, 2021. Retrieved from https://simplemaps.com/data/world-cities. Accessed September 3, 2021.


63 scholarly articles cite the yelp_review_full dataset, according to Google Scholar.
