55 datasets found
  1. yelp_polarity_reviews

    • tensorflow.org
    yelp_polarity_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/yelp_polarity_reviews
    Description

    Large Yelp Review Dataset. This is a dataset for binary sentiment classification. We provide a set of 560,000 highly polar Yelp reviews for training, and 38,000 for testing.

    ORIGIN

    The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data. For more information, please refer to http://www.yelp.com/dataset

    The Yelp reviews polarity dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It was first used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).

    DESCRIPTION

    The Yelp reviews polarity dataset is constructed by considering stars 1 and 2 negative, and 3 and 4 positive. For each polarity, 280,000 training samples and 19,000 testing samples are taken randomly. In total there are 560,000 training samples and 38,000 testing samples. Negative polarity is class 1, and positive polarity is class 2.

    The files train.csv and test.csv contain the training and testing samples, respectively, as comma-separated values. There are 2 columns in them, corresponding to class index (1 and 2) and review text. The review texts are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
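    The escaping rules above can be handled with Python's standard csv module. The following is a minimal sketch, not the dataset authors' own loader: the two sample rows and the helper name read_polarity_rows are made up for illustration, and it assumes each row has exactly the two columns described.

```python
import csv
import io

# Hypothetical two-row sample in the documented layout: class index, then review text.
# "\\n" in this source is a literal backslash followed by "n", as in the CSV files.
SAMPLE = (
    '"1","The waiter said ""no"" twice.\\nNever again."\n'
    '"2","Great tacos."\n'
)


def read_polarity_rows(handle):
    """Yield (label, text) pairs; label 1 is negative, label 2 is positive."""
    for class_index, text in csv.reader(handle):
        # The csv module already collapses "" into a single double quote.
        # Turn the escaped backslash-n sequence back into a real newline.
        yield int(class_index), text.replace("\\n", "\n")


rows = list(read_polarity_rows(io.StringIO(SAMPLE)))
```

    For a real file, the same function could be given an open handle to train.csv or test.csv, opened with newline="" as the csv module documentation recommends.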

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('yelp_polarity_reviews', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  2. Cumulative number of reviews submitted to Yelp 2009-2022

    • statista.com
    Updated Apr 4, 2025
    Statista (2025). Cumulative number of reviews submitted to Yelp 2009-2022 [Dataset]. https://www.statista.com/statistics/278032/cumulative-number-of-reviews-submitted-to-yelp/
    Dataset updated
    Apr 4, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    At the end of 2021, a total of 244 million reviews had been submitted to the local business review and recommendation site Yelp, representing a nine percent year-on-year increase from the 224 million reviews at the end of the previous year.

  3. yelp_review_full

    • huggingface.co
    Updated Mar 6, 2012
    Yelp (2012). yelp_review_full [Dataset]. https://huggingface.co/datasets/Yelp/yelp_review_full
    Dataset updated
    Mar 6, 2012
    Dataset authored and provided by
    Yelp (http://yelp.com/)
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for YelpReviewFull

      Dataset Summary

    The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data.

      Supported Tasks and Leaderboards

    text-classification, sentiment-classification: The dataset is mainly used for text classification: given the text, predict the sentiment.

      Languages

    The reviews were mainly written in English.

      Dataset Structure

      Data Instances
    

    A… See the full description on the dataset page: https://huggingface.co/datasets/Yelp/yelp_review_full.

  4. Yelp: monthly visitors 2019-2024, by device

    • statista.com
    Updated Jun 2, 2025
    Statista (2025). Yelp: monthly visitors 2019-2024, by device [Dataset]. https://www.statista.com/statistics/1326159/number-of-monthly-visitors-to-yelp-by-device/
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, Yelp had a total of ***** million monthly mobile web visitors, and over ** million monthly desktop visitors. Almost ** million visitors accessed Yelp via the mobile app. Mobile web visits were at their highest in 2019, with over ** million visitors accessing the site via desktop per month.

  5. Yelp dataset 2024

    • kaggle.com
    Updated Oct 29, 2024
    snax07 (2024). Yelp dataset 2024 [Dataset]. https://www.kaggle.com/datasets/snax07/yelp-dataset-2024
    Dataset updated
    Oct 29, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    snax07
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Yelp Dataset JSON

    Each file is composed of a single object type, one JSON object per line.

    Take a look at some examples to get you started: https://github.com/Yelp/dataset-examples.

    Note: the following examples contain inline comments, which are technically not valid JSON. This is done here to simplify the documentation and explain the structure; the JSON files you download will not contain any comments and will be fully valid JSON.
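    Because each file holds one JSON object per line, it can be processed as a stream instead of being loaded whole. The sketch below is illustrative only: the helper names and the three sample records are made up, and it relies just on the business_id and stars fields documented for review.json further down in this entry.

```python
import io
import json
from collections import defaultdict


def iter_json_lines(handle):
    """Yield one parsed object per non-empty line."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


def mean_stars_by_business(reviews):
    """Average the integer star ratings of the reviews for each business_id."""
    totals = defaultdict(lambda: [0, 0])  # business_id -> [star sum, review count]
    for review in reviews:
        entry = totals[review["business_id"]]
        entry[0] += review["stars"]
        entry[1] += 1
    return {bid: star_sum / count for bid, (star_sum, count) in totals.items()}


# Made-up records standing in for lines of review.json.
SAMPLE = io.StringIO(
    '{"business_id": "b1", "stars": 4}\n'
    '{"business_id": "b1", "stars": 2}\n'
    '{"business_id": "b2", "stars": 5}\n'
)
averages = mean_stars_by_business(iter_json_lines(SAMPLE))
```

    Streaming keeps memory use roughly proportional to the number of distinct businesses rather than to the number of reviews.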

    business.json

    Contains business data including location data, attributes, and categories.

    {
    // string, 22 character unique string business id
    "business_id": "tnhfDv5Il8EaGSXZGiuQGg",

    // string, the business's name
    "name": "Garaje",
    
    // string, the full address of the business
    "address": "475 3rd St",
    
    // string, the city
    "city": "San Francisco",
    
    // string, 2 character state code, if applicable
    "state": "CA",
    
    // string, the postal code
    "postal code": "94107",
    
    // float, latitude
    "latitude": 37.7817529521,
    
    // float, longitude
    "longitude": -122.39612197,
    
    // float, star rating, rounded to half-stars
    "stars": 4.5,
    
    // integer, number of reviews
    "review_count": 1198,
    
    // integer, 0 or 1 for closed or open, respectively
    "is_open": 1,
    
    // object, business attributes to values. note: some attribute values might be objects
    "attributes": {
      "RestaurantsTakeOut": true,
      "BusinessParking": {
        "garage": false,
        "street": true,
        "validated": false,
        "lot": false,
        "valet": false
      },
    },
    
    // an array of strings of business categories
    "categories": [
      "Mexican",
      "Burgers",
      "Gastropubs"
    ],
    
    // an object of key day to value hours, hours are using a 24hr clock
    "hours": {
      "Monday": "10:00-21:00",
      "Tuesday": "10:00-21:00",
      "Friday": "10:00-21:00",
      "Wednesday": "10:00-21:00",
      "Thursday": "10:00-21:00",
      "Sunday": "11:00-18:00",
      "Saturday": "10:00-21:00"
    }
    

    }

    review.json

    Contains full review text data including the user_id that wrote the review and the business_id the review is written for.

    {
    // string, 22 character unique review id
    "review_id": "zdSx_SD6obEhz9VrW9uAWA",

    // string, 22 character unique user id, maps to the user in user.json
    "user_id": "Ha3iJu77CxlrFm-vQRs_8g",
    
    // string, 22 character business id, maps to business in business.json
    "business_id": "tnhfDv5Il8EaGSXZGiuQGg",
    
    // integer, star rating
    "stars": 4,
    
    // string, date formatted YYYY-MM-DD
    "date": "2016-03-09",
    
    // string, the review itself
    "text": "Great place to hang out after work: the prices are decent, and the ambience is fun. It's a bit loud, but very lively. The staff is friendly, and the food is good. They have a good selection of drinks.",
    
    // integer, number of useful votes received
    "useful": 0,
    
    // integer, number of funny votes received
    "funny": 0,
    
    // integer, number of cool votes received
    "cool": 0
    

    }

    user.json

    User data including the user's friend mapping and all the metadata associated with the user.

    {
    // string, 22 character unique user id, maps to the user in user.json
    "user_id": "Ha3iJu77CxlrFm-vQRs_8g",

    // string, the user's first name
    "name": "Sebastien",
    
    // integer, the number of reviews they've written
    "review_count": 56,
    
    // string, when the user joined Yelp, formatted like YYYY-MM-DD
    "yelping_since": "2011-01-01",
    
    // array of strings, an array of the user's friend as user_ids
    "friends": [
      "wqoXYLWmpkEH0YvTmHBsJQ",
      "KUXLLiJGrjtSsapmxmpvTA",
      "6e9rJKQC3n0RSKyHLViL-Q"
    ],
    
    // integer, number of useful votes sent by the user
    "useful": 21,
    
    // integer, number of funny votes sent by the user
    "funny": 88,
    
    // integer, number of cool votes sent by the user
    "cool": 15,
    
    // integer, number of fans the user has
    "fans": 1032,
    
    // array of integers, the years the user was elite
    "elite": [
      2012,
      2013
    ],
    
    // float, average rating of all reviews
    "average_stars": 4.31,
    
    // integer, number of hot compliments received by the user
    "compliment_hot": 339,
    
    // integer, number of more compliments received by the user
    "compliment_more": 668,
    
    // integer, number of profile compliments received by the user
    "compliment_profile": 42,
    
    // integer, number of cute compliments received by the user
    "compliment_cute": 62,
    
    // integer, number of list compliments received by the user
    "compliment_list": 37,
    
    // integer, number of note compliments received by the user
    "compliment_note": 356,
    
    // integer, number of plain compliments received by the user
    "compliment_plain": 68,
    
    // integer, number of coo...
    
  6. Yelp Statistics By Users, Demographics, Revenue and Facts (2025)

    • sci-tech-today.com
    Updated May 2, 2025
    Sci-Tech Today (2025). Yelp Statistics By Users, Demographics, Revenue and Facts (2025) [Dataset]. https://www.sci-tech-today.com/stats/yelp-statistics-updated/
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Sci-Tech Today
    License

    https://www.sci-tech-today.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Introduction

    Yelp Statistics: Yelp is a popular online platform that helps users find local businesses based on reviews and ratings from other customers. In 2024, Yelp continues to hold a significant presence, particularly in the U.S., where most of its traffic and revenue are generated. Yelp offers both a website and a mobile app, making it easy for people to access business information and read reviews.

    Businesses can also advertise on Yelp, using the platform to reach new customers and manage their online reputation. With its comprehensive review system, Yelp plays a vital role in connecting people with trusted local businesses. This article will help you understand Yelp's key statistics and trends, providing valuable insights for businesses aiming to optimize their presence on the platform.

  7. Yelp: number of unique mobile visitors 2016-2021

    • statista.com
    Updated Feb 28, 2023
    Statista (2023). Yelp: number of unique mobile visitors 2016-2021 [Dataset]. https://www.statista.com/statistics/385440/unique-mobile-visitors-yelp/
    Dataset updated
    Feb 28, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    The timeline shows the number of unique mobile visitors to recommendation platform Yelp from 2016 to 2021, per quarter. The local search and review site's mobile visitor numbers have displayed a steady growth, reaching 31 million unique mobile app devices in the first quarter of 2021.

  8. Yelp Business Reviews

    • openwebninja.com
    json
    Updated Jul 22, 2024
    OpenWeb Ninja (2024). Yelp Business Reviews [Dataset]. https://www.openwebninja.com/api/yelp-business-data
    Available download formats: json
    Dataset updated
    Jul 22, 2024
    Dataset authored and provided by
    OpenWeb Ninja
    Area covered
    Global
    Description

    This dataset provides comprehensive business information and reviews from Yelp. It includes detailed business data, customer reviews, ratings, and search capabilities for local businesses and restaurants. Perfect for applications requiring local business intelligence and customer feedback analysis. The dataset is delivered in a JSON format via REST API.

  9. Datasets used in the study: TripAdvisor and Yelp review data, tweets related...

    • figshare.com
    zip
    Updated May 4, 2023
    INNOCENSIA OWUOR (2023). Datasets used in the study: TripAdvisor and Yelp review data, tweets related to points of interest in Florida and New York. [Dataset]. http://doi.org/10.6084/m9.figshare.22766654.v1
    Available download formats: zip
    Dataset updated
    May 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    INNOCENSIA OWUOR
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Florida
    Description

    Contains TripAdvisor and Yelp review data, and tweets related to points of interest in Florida and New York. twitter, yelp, Florida, New York, data mining

  10. Yelp Open Dataset

    • live.european-language-grid.eu
    json
    Updated Dec 30, 2015
    Yelp (2015). Yelp Open Dataset [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/5179
    Available download formats: json
    Dataset updated
    Dec 30, 2015
    Dataset authored and provided by
    Yelp (http://yelp.com/)
    License

    https://s3-media0.fl.yelpcdn.com/assets/srv0/engineering_pages/bea5c1e92bf3/assets/vendor/yelp-dataset-agreement.pdf

    Description

    Dataset containing millions of reviews on Yelp. In addition it contains business data including location data, attributes, and categories.

  11. Yelp quarterly net ad revenue share 2020-2025, by category

    • statista.com
    Updated May 21, 2025
    Statista (2025). Yelp quarterly net ad revenue share 2020-2025, by category [Dataset]. https://www.statista.com/statistics/1326146/yelp-advertising-revenue-from-quarterly-by-category/
    Dataset updated
    May 21, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    American online directory Yelp generated over *** million U.S. dollars in advertising revenue from service businesses in the first quarter of 2025, which include home, local, professional, and pet businesses, amongst other companies. Overall, *** million U.S. dollars were generated via restaurants, retail, and other businesses, such as fitness, and health and beauty.

  12. Yelp annual net income 2007-2024

    • statista.com
    Updated Jun 3, 2025
    Statista (2025). Yelp annual net income 2007-2024 [Dataset]. https://www.statista.com/statistics/278031/yelps-annual-net-loss-and-income/
    Dataset updated
    Jun 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, the local review and search site Yelp's net income amounted to *** million US dollars, up from a net income of ** million U.S. dollars in 2023. In 2020, Yelp's business was strongly impacted by shifts in ad budgets due to the global coronavirus outbreak, but the company has since recovered to higher annual net income levels than in 2019. In 2024, Yelp generated over *********** U.S. dollars in annual net revenue.

  13. Yelp: quarterly net revenue share 2010-2020, by vertical

    • statista.com
    Updated Feb 28, 2023
    Statista (2023). Yelp: quarterly net revenue share 2010-2020, by vertical [Dataset]. https://www.statista.com/statistics/478578/yelps-quarterly-net-revenue-share-by-vertical/
    Dataset updated
    Feb 28, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    The timeline shows the distribution of Yelp's quarterly net revenues, sorted by vertical. In the fourth quarter of 2020, 11 percent of advertising revenue was generated through the restaurant vertical. Home & local accounted for the biggest share with 44 percent.

  14. Yelp-Fraud Dataset

    • paperswithcode.com
    Updated Apr 21, 2025
    Yingtong Dou; Zhiwei Liu; Li Sun; Yutong Deng; Hao Peng; Philip S. Yu (2025). Yelp-Fraud Dataset [Dataset]. https://paperswithcode.com/dataset/yelpchi
    Dataset updated
    Apr 21, 2025
    Authors
    Yingtong Dou; Zhiwei Liu; Li Sun; Yutong Deng; Hao Peng; Philip S. Yu
    Description

    Yelp-Fraud is a multi-relational graph dataset built upon the Yelp spam review dataset, which can be used in evaluating graph-based node classification, fraud detection, and anomaly detection models.

    Dataset Statistics

    # Nodes: 45,954
    % Fraud Nodes (Class=1): 14.5

    Relation    # Edges
    R-U-R       (missing)
    R-T-R       (missing)
    R-S-R       3,402,743
    All         (missing)

    Graph Construction

    The Yelp spam review dataset includes hotel and restaurant reviews filtered (spam) and recommended (legitimate) by Yelp. We conduct a spam review detection task on the Yelp-Fraud dataset which is a binary classification task. We take 32 handcrafted features from SpEagle paper as the raw node features for Yelp-Fraud. Based on previous studies which show that opinion fraudsters have connections in user, product, review text, and time, we take reviews as nodes in the graph and design three relations: 1) R-U-R: it connects reviews posted by the same user; 2) R-S-R: it connects reviews under the same product with the same star rating (1-5 stars); 3) R-T-R: it connects two reviews under the same product posted in the same month.
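    The three relations described above can be sketched as grouping reviews by a shared key and connecting every pair inside each group. This is an illustrative reconstruction, not the authors' construction code: the record fields (id, user, product, stars, month) and the function name build_relations are assumptions, and the four sample reviews are made up.

```python
from collections import defaultdict
from itertools import combinations


def build_relations(reviews):
    """Return a dict mapping relation name to a set of (review_id, review_id) edges."""
    groups = {
        "R-U-R": defaultdict(list),  # same user
        "R-S-R": defaultdict(list),  # same product and same star rating
        "R-T-R": defaultdict(list),  # same product and same month
    }
    for r in reviews:
        groups["R-U-R"][r["user"]].append(r["id"])
        groups["R-S-R"][(r["product"], r["stars"])].append(r["id"])
        groups["R-T-R"][(r["product"], r["month"])].append(r["id"])
    # Every unordered pair of reviews in the same bucket becomes one edge.
    return {
        name: {pair for ids in buckets.values() for pair in combinations(sorted(ids), 2)}
        for name, buckets in groups.items()
    }


# Made-up reviews; field names are hypothetical.
reviews = [
    {"id": 0, "user": "u1", "product": "p1", "stars": 5, "month": "2012-03"},
    {"id": 1, "user": "u1", "product": "p2", "stars": 1, "month": "2012-03"},
    {"id": 2, "user": "u2", "product": "p1", "stars": 5, "month": "2012-04"},
    {"id": 3, "user": "u3", "product": "p1", "stars": 2, "month": "2012-04"},
]
edges = build_relations(reviews)
```

    Pairwise expansion grows quadratically with bucket size, which is one plausible reason a relation such as R-S-R can carry millions of edges on about 46,000 nodes.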

    To download the dataset, please visit this GitHub repo. For any other questions, please email ytongdou(AT)gmail.com.

  15. Yelp Dataset

    • kaggle.com
    zip
    Updated Mar 17, 2022
    Yelp, Inc. (2022). Yelp Dataset [Dataset]. https://www.kaggle.com/yelp-dataset/yelp-dataset
    Available download formats: zip (4374983563 bytes)
    Dataset updated
    Mar 17, 2022
    Dataset provided by
    Yelp (http://yelp.com/)
    Authors
    Yelp, Inc.
    Description

    Context

    This dataset is a subset of Yelp's businesses, reviews, and user data. It was originally put together for the Yelp Dataset Challenge which is a chance for students to conduct research or analysis on Yelp's data and share their discoveries. In the most recent dataset you'll find information about businesses across 8 metropolitan areas in the USA and Canada.

    Content

    This dataset contains five JSON files and the user agreement. More information about those files can be found here.

    Code snippet to read the files

    In Python, you can read the JSON files like this (using the json and pandas libraries):

    import json
    import pandas as pd

    # Each line of the file is one JSON object, so parse it line by line.
    data = []
    with open("yelp_academic_dataset_checkin.json", encoding="utf-8") as data_file:
        for line in data_file:
            data.append(json.loads(line))
    checkin_df = pd.DataFrame(data)
    
    
  16. Yelp annual net revenue 2007-2024

    • statista.com
    Updated Jun 2, 2025
    Statista (2025). Yelp annual net revenue 2007-2024 [Dataset]. https://www.statista.com/statistics/278022/yelps-annual-net-revenue/
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, the revenue generated by local business review platform Yelp amounted to over *** billion U.S. dollars, up from **** billion U.S. dollars in the previous year. Overall, 2020 saw the company's first decline in annual net revenue, seeing a ** percent decrease from 2019.

  17. The Yelp Collaborative Knowledge Graph

    • data.niaid.nih.gov
    Updated Jun 17, 2023
    Nielsen, Christian Filip Pinderup (2023). The Yelp Collaborative Knowledge Graph [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7878446
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    Corfixen, Mads
    Olesen, Magnus
    Heede, Thomas
    Nielsen, Christian Filip Pinderup
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This is The Yelp Collaborative Knowledge Graph (YCKG), a transformation of the Yelp Open Dataset into RDF format using Y2KG.

    Paper Abstract

    The Yelp Open Dataset (YOD) contains data about businesses, reviews, and users from the Yelp website and is available for research purposes. This dataset has been widely used to develop and test Recommender Systems (RS), especially those using Knowledge Graphs (KGs), e.g., integrating taxonomies, product categories, business locations, and social network information. Unfortunately, researchers applied naive or wrong mappings while converting YOD into KGs, consequently obtaining unrealistic results. Among the various issues, the conversion processes usually do not follow state-of-the-art methodologies and fail to properly link to other KGs and reuse existing vocabularies. In this work, we overcome these issues by introducing Y2KG, a utility to convert the Yelp dataset into a KG. Y2KG consists of two components. The first is a dataset including (1) a vocabulary that extends Schema.org with properties to describe the concepts in YOD and (2) mappings between the Yelp entities and Wikidata. The second component is a set of scripts to transform YOD into RDF and obtain the Yelp Collaborative Knowledge Graph (YCKG). The design of Y2KG was driven by 16 core competency questions. YCKG includes 150k businesses and 16.9M reviews from 1.9M distinct real users, resulting in over 244 million triples (with 144 distinct predicates) for about 72 million resources, with an average in-degree and out-degree of 3.3 and 12.2, respectively.

    Links

    Latest GitHub release: https://github.com/MadsCorfixen/The-Yelp-Collaborative-Knowledge-Graph/releases/latest

    PURL domain: https://purl.archive.org/domain/yckg

    Files

    Graph Data Triple Files

    One sample file for each of the Yelp domains (Businesses, Users, Reviews, Tips and Checkins), each containing 20 entities.

    yelp_schema_mappings.nt.gz containing the mappings from Yelp categories to Schema things.

    schema_hierarchy.nt.gz containing the full hierarchy of the mapped Schema things.

    yelp_wiki_mappings.nt.gz containing the mappings from Yelp categories to Wikidata entities.

    wikidata_location_mappings.nt.gz containing the mappings from Yelp locations to Wikidata entities.
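    The .nt.gz files listed above are gzip-compressed N-Triples, which hold one triple per line. The following is a deliberately simplified reading sketch using only the standard library: it splits each line on its first two spaces and does not validate terms, so a dedicated RDF library is the safer choice for real work. The URIs in the sample are invented and do not come from the YCKG files.

```python
import gzip
import io


def iter_triples(handle):
    """Yield (subject, predicate, object) strings from N-Triples text lines.

    Simplified: assumes single spaces between terms and a trailing " ." terminator.
    """
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, predicate, rest = line.split(" ", 2)
        # Drop the statement terminator (spaces and the final dot).
        yield subject, predicate, rest.rstrip(" .")


# Invented triple, compressed in memory to stand in for a .nt.gz file.
SAMPLE = (
    "<https://example.org/category/Mexican> "
    "<https://example.org/mapsTo> <https://schema.org/Restaurant> .\n"
)
compressed = gzip.compress(SAMPLE.encode("utf-8"))

with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as handle:
    triples = list(iter_triples(handle))
```

    With a downloaded file, gzip.open could be given the file path instead of the in-memory buffer.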

    Graph Metadata Triple Files

    yelp_categories.ttl contains metadata for all Yelp categories.

    yelp_entities.ttl contains metadata regarding the dataset.

    yelp_vocabulary.ttl contains metadata on the created Yelp vocabulary and properties.

    Utility Files

    yelp_category_schema_mappings.csv. This file contains the 310 mappings from Yelp categories to Schema types. These mappings have been manually verified to be correct.

    yelp_predicate_schema_mappings.csv. This file contains the 14 mappings from Yelp attributes to Schema properties. These mappings are manually found.

    ground_truth_yelp_category_schema_mappings.csv. This file contains the ground truth, based on 200 manually verified mappings from Yelp categories to Schema things. The ground truth mappings were used to calculate precision and recall for the semantic mappings.

    manually_split_categories.csv. This file contains all Yelp categories containing either a & or /, and their manually split versions. The split versions have been used in the semantic mappings to Schema things.

  18. yelp_dataset

    • kaggle.com
    zip
    Updated Apr 9, 2024
    Sahil Bajaj (2024). yelp_dataset [Dataset]. https://www.kaggle.com/datasets/sahilnbajaj/yelp-dataset
    Available download formats: zip (0 bytes)
    Dataset updated
    Apr 9, 2024
    Authors
    Sahil Bajaj
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset is a subset of Yelp's businesses, reviews, and user data. It was originally put together for the Yelp Dataset Challenge which is a chance for students to conduct research or analysis on Yelp's data and share their discoveries. In the most recent dataset you'll find information about businesses across 8 metropolitan areas in the USA and Canada.

  19. Pickled Yelp Data

    • figshare.com
    bin
    Updated May 13, 2017
    Bovey Wu (2017). Pickled Yelp Data [Dataset]. http://doi.org/10.6084/m9.figshare.5002055.v2
    Available download formats: bin
    Dataset updated
    May 13, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Bovey Wu
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    pickled file of reviews from Yelp based on 'gym' search at county level; one file is the condensed version that does not include individual facilities within counties.

  20. Same Sentiment Classification Train/Dev/Test Pair IDs

    • zenodo.org
    • explore.openaire.eu
    • +1more
    zip
    Updated Jun 14, 2022
    Erik Körner; Ahmad Dawar Hakimi; Gerhard Heyer; Martin Potthast (2022). Same Sentiment Classification Train/Dev/Test Pair IDs [Dataset]. http://doi.org/10.5281/zenodo.5495793
    Available download formats: zip
    Dataset updated
    Jun 14, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Erik Körner; Ahmad Dawar Hakimi; Gerhard Heyer; Martin Potthast
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This "dataset" only includes the compiled pairings of the Yelp Business Review Dataset. To get access to the actual review texts, please follow the instructions on the Yelp Dataset webpage.

    The data format is JSONlines.
    Python Load Example:

    import pandas as pd
    traindev_df = pd.read_json("df_traindev.jsonl", lines=True)
    test_df = pd.read_json("df_test.jsonl", lines=True)
    
    # example access to single business/review id
    s1_bid = test_df.iloc[0]["sent1_business_id"]
    s1_rid = test_df.iloc[0]["sent1_review_id"]
    s2_bid = test_df.iloc[0]["sent2_business_id"]
    s2_rid = test_df.iloc[0]["sent2_review_id"]
    label = test_df.iloc[0]["is_same_side"]

    See documentation at:

    For details on how the data was compiled and used in our experiments, please refer to our code repository. Other derived data splits can be reproduced deterministically by using the same random seed as in our experiments.
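    Since this entry ships only ID pairs, the review texts have to be looked up in a separately obtained Yelp review file. The sketch below shows one way that lookup could work, under assumptions: it uses the pair columns shown in the load example (sent1_review_id, sent2_review_id, is_same_side) and assumes the review file is one JSON object per line with review_id and text fields. The helper names and sample records are made up, and this is not the authors' pipeline.

```python
import io
import json


def load_review_texts(handle):
    """Map review_id -> text from a one-JSON-object-per-line review file."""
    texts = {}
    for line in handle:
        if line.strip():
            record = json.loads(line)
            texts[record["review_id"]] = record["text"]
    return texts


def resolve_pairs(pairs, texts):
    """Yield (text1, text2, is_same_side) for pairs whose reviews are both available."""
    for pair in pairs:
        first = texts.get(pair["sent1_review_id"])
        second = texts.get(pair["sent2_review_id"])
        if first is not None and second is not None:
            yield first, second, pair["is_same_side"]


# Made-up stand-ins for the review file and the pair file.
REVIEWS = io.StringIO(
    '{"review_id": "r1", "text": "Loved it."}\n'
    '{"review_id": "r2", "text": "Would return."}\n'
)
PAIRS = [
    {"sent1_review_id": "r1", "sent2_review_id": "r2", "is_same_side": True},
    {"sent1_review_id": "r1", "sent2_review_id": "r9", "is_same_side": False},
]
resolved = list(resolve_pairs(PAIRS, load_review_texts(REVIEWS)))
```

    Pairs that reference a review missing from the local file are skipped rather than raising an error, which is a design choice of this sketch.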

yelp_polarity_reviews: 7 scholarly articles cite this dataset.