100+ datasets found

yelp_review_full
huggingface.co
Updated Mar 6, 2012
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yelp (2012). yelp_review_full [Dataset]. https://huggingface.co/datasets/Yelp/yelp_review_full
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 6, 2012
Dataset authored and provided by
Yelphttp://yelp.com/
License
https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/
Description
Dataset Card for YelpReviewFull

Dataset Summary

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data.

Supported Tasks and Leaderboards

text-classification, sentiment-classification: The dataset is mainly used for text classification: given the text, predict the sentiment.

Languages

The reviews were mainly written in english.

Dataset Structure Data Instances

A… See the full description on the dataset page: https://huggingface.co/datasets/Yelp/yelp_review_full.
Processed YELP Dataset
kaggle.com
zip
Updated Dec 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nguyen Hieu Xt (2024). Processed YELP Dataset [Dataset]. https://www.kaggle.com/datasets/nguyenhieuxt/processed-yelp-dataset
Explore at:
zip(3655803662 bytes)Available download formats
Dataset updated
Dec 16, 2024
Authors
Nguyen Hieu Xt
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset is derived from the YELP Dataset, with the following preprocessing steps applied:

All JSON files in the raw dataset have been converted to CSV format.

The JSON file containing review data has been split into two subsets: the training set (containing data from 2014 to 2019) and the testing set (containing data from 2020 to 2021).

Columns in the user data that begin with the prefix "compliment_" have been removed. A new column, "num_compliments" has been created, representing the sum of all values from the removed columns.

The JSON files containing tip and check-in information have been removed, as they are deemed unnecessary for the review-based recommendation system.
d
Grepsr| Yelp Resturants Address and Reviews Data | Global Coverage with...
datarade.ai
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Grepsr, Grepsr| Yelp Resturants Address and Reviews Data | Global Coverage with Custom and On-demand Datasets [Dataset]. https://datarade.ai/data-products/grepsr-yelp-resturants-address-and-reviews-data-global-cov-grepsr
Explore at:
.bin, .json, .xml, .csv, .xls, .sql, .txtAvailable download formats
Dataset authored and provided by
Grepsr
Area covered
Venezuela (Bolivarian Republic of), Anguilla, Ethiopia, Turkey, Sudan, Iran (Islamic Republic of), Latvia, Saint Lucia, Gambia, United Arab Emirates
Description
Use cases that can be supported with Yelp Reviews

A. Market Research and Analysis: Leverage Yelp data to conduct comprehensive market research and analysis in the restaurant industry. Identify emerging culinary trends, popular cuisines, and customer preferences. Gain a competitive edge by understanding your target audience's needs and expectations.

B. Competitor Analysis: Compare and contrast your restaurant with competitors on Yelp. Analyze their ratings, customer reviews, and performance metrics to identify strengths and weaknesses. Use these insights to enhance your offerings and stand out in the market.

C. Reputation Management: Monitor and manage your restaurant's online reputation effectively. Track and analyze customer reviews and ratings on Yelp to identify improvement areas and promptly address negative feedback. Positive reviews can be leveraged for marketing and branding purposes.

D. Pricing and Revenue Optimization: Leverage the Yelp dataset to analyze pricing strategies and revenue trends in the restaurant sector. Understand seasonal demand fluctuations, pricing patterns, and revenue optimization opportunities to maximize your restaurant's profitability.

E. Customer Sentiment Analysis: Conduct sentiment analysis on Yelp reviews to gauge customer satisfaction and sentiment towards your restaurant. Use this information to improve dining experiences, address pain points, and enhance overall customer satisfaction.

F. Content Marketing and SEO: Create compelling content for your restaurant's website based on popular keywords, cuisines, and dining preferences identified in the Yelp dataset. Optimize your content to improve search engine rankings and attract more potential diners.

G. Personalized Marketing Campaigns: Use Yelp data to segment your target audience based on dining preferences, food habits, and demographics. Develop personalized marketing campaigns that resonate with different customer segments, resulting in higher engagement and repeat business.

H. Investment and Expansion Decisions: Access historical and real-time data on restaurant performance and market dynamics from Yelp. Utilize this information to make data-driven investment decisions, identify potential areas for expansion, and assess the feasibility of new culinary ventures.

I. Predictive Analytics: Utilize the Yelp dataset to build predictive models that forecast future trends in the restaurant industry. Anticipate shifts in culinary preferences, understand customer behavior, and make proactive decisions to stay ahead of the competition.

J. Business Intelligence Dashboards: Create interactive and insightful dashboards that visualize key performance metrics from the Yelp dataset. These dashboards can help restaurant executives and stakeholders get a quick overview of the restaurant's performance and make data-driven decisions.

Incorporating the Yelp dataset into your business processes will enhance your understanding of the restaurant market, facilitate data-driven decision-making, and provide valuable insights to drive success in the competitive culinary industry.
yelp open data philly restaurants
kaggle.com
zip
Updated Apr 14, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Cade Apple (2024). yelp open data philly restaurants [Dataset]. https://www.kaggle.com/datasets/capple7/yelp-open-data-philly-restaurants
Explore at:
zip(5268181410 bytes)Available download formats
Dataset updated
Apr 14, 2024
Authors
Cade Apple
Description
YELP DATASET TERMS OF USE Last Updated: February 16, 2021 This document (“Data Agreement”) governs the terms under which you may access and use the data that Yelp makes available for download through this website (or made available by other means) solely for academic or non-commercial purposes (the “Data”). Yelp Terms of Service: By accessing or using the Data, you agree to be bound by the Data Agreement and represent that the contact information you provide to Yelp is correct. If you access or use the Data on behalf of a university, school, or other entity, you represent that you have authority to bind such entity and its affiliates to the Data Agreement and that it is fully binding upon them. In such a case, the term “you” and “your” will refer to such an entity and its affiliates. If you do not have authority, or if you do not agree with the terms of the Data Agreement, you may not access or use the Data. You should read and keep a copy of each component of the Data Agreement for your records. In the event of a conflict among them, the terms of this document will control. 1. Purpose The Data is made available by Yelp Inc. (“Yelp”) to enable you to access valuable local information to develop an academic project as part of an ongoing course of study or for non-commercial purposes. With this in mind, Yelp reserves the right to continually review and evaluate all uses of the Data provided under the Data Agreement. Under certain circumstances, Yelp may authorize limited commercial use under certain circumstances, for example, access and use by journalists to explore our data to generate ideas prior to formal data access requests from Yelp’s PR department. 2. Changes Yelp reserves the right to modify or revise the Data Agreement at any time. If the change is deemed to be material and it is foreseeable that such change could be adverse to your interests, Yelp will provide you notice of the change to this Data Agreement by sending you an email to the email you provided to Yelp. Your continued use of the Data after the notice of material change will constitute your acceptance of and agreement to such changes. IF YOU DO NOT WISH TO BE BOUND TO ANY NEW TERMS, YOU MUST TERMINATE THE DATA AGREEMENT BY IMMEDIATELY CEASING USE OF THE DATA AND DELETING IT FROM ANY SYSTEMS OR MEDIA. 3. License Subject to the terms set forth in the Data Agreement (specifically the restrictions set forth in Section 4 below), Yelp grants you a royalty-free, non-exclusive, revocable, non-sublicensable, non-transferable, fully paid-up right and license during the Term to use, access, and create derivative works of the Data in electronic form for solely for non-commercial use.. Non-commercial use means use of the Data by registered nonprofits, government, educational institutions, and think tanks which (a) is not undertaken for profit, or (b) is not intended to produce works, services, or data for commercial use. You may not use the Data for any other purpose without Yelp’s prior written consent. You acknowledge and agree that Yelp may request information about, review, audit, and/or monitor your use of the Data at any time in order to confirm compliance with the Data Agreement. Nothing herein shall be construed as a license to use Yelp’s registered trademarks or service marks, or any other Yelp branding. Prior to any public presentation or publication of the academic results or conclusions that involve the Data and/or the Yelp brand name, you must submit your findings to Yelp for review and approval, and Yelp will approve of the public release within five (5) business days of its submission to Yelp. 4. Restrictions You agree that you will not, and will not encourage, assist, or enable others to: A. display, perform, or distribute any of the Data, or use the Data to update or create your own business listing information for commercial purposes (i.e. you may not publicly display any of the Data to any third party, especially reviews and other user generated content, as this is a private data set challenge and not a license to compete with or disparage with Yelp); B. use the Data in connection with any commercial purpose; C. use the Data in any manner or for any purpose that may violate any law or regulation, or any right of any person including, but not limited to, intellectual property rights, rights of privacy and/or rights of personality, or which otherwise may be harmful (in Yelp's sole discretion) to Yelp, its providers, its suppliers, end users of this website, or your end users; D. use the Data on behalf of any third party without Yelp’s consent; E. create, redistribute or disclose any summary of, or metrics related to, the Data (e.g., the number of reviewed business included in the Data and other statistical analysis) to any third party or on any website or other electronic media not expressly covered by this Agreement or without Yelp’s ...
Yelp Dataset
kaggle.com
zip
Updated Mar 17, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yelp, Inc. (2022). Yelp Dataset [Dataset]. https://www.kaggle.com/datasets/yelp-dataset/yelp-dataset/code
Explore at:
zip(4374983563 bytes)Available download formats
Dataset updated
Mar 17, 2022
Dataset provided by
Yelphttp://yelp.com/
Authors
Yelp, Inc.
Description
Context

This dataset is a subset of Yelp's businesses, reviews, and user data. It was originally put together for the Yelp Dataset Challenge which is a chance for students to conduct research or analysis on Yelp's data and share their discoveries. In the most recent dataset you'll find information about businesses across 8 metropolitan areas in the USA and Canada.

Content

This dataset contains five JSON files and the user agreement. More information about those files can be found here.

Code snippet to read the files

in Python, you can read the JSON files like this (using the json and pandas libraries):

import json import pandas as pd data_file = open("yelp_academic_dataset_checkin.json") data = [] for line in data_file: data.append(json.loads(line)) checkin_df = pd.DataFrame(data) data_file.close()
r
YELP
resodate.org
service.tib.eu
Updated Dec 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yueen Ma; Dafeng Chi; Jingjing Li; Kai Song; Yuzheng Zhuang; Irwin King (2024). YELP [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQveWVscA==
Explore at:
Dataset updated
Dec 3, 2024
Dataset provided by
Leibniz Data Manager
Authors
Yueen Ma; Dafeng Chi; Jingjing Li; Kai Song; Yuzheng Zhuang; Irwin King
Description
The YELP dataset is used for language modeling.
r
Yelp Reviews
resodate.org
service.tib.eu
Updated Dec 2, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jiacheng Xu; Danlu Chen; Xipeng Qiu; Xuangjing Huang (2024). Yelp Reviews [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQveWVscC1yZXZpZXdz
Explore at:
Dataset updated
Dec 2, 2024
Dataset provided by
Leibniz Data Manager
Authors
Jiacheng Xu; Danlu Chen; Xipeng Qiu; Xuangjing Huang
Description
Yelp Reviews is a large dataset of customer reviews.
Yelp Inc. Alternative Data Analytics
meyka.com
Updated Sep 24, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Meyka (2025). Yelp Inc. Alternative Data Analytics [Dataset]. https://meyka.com/stock/YELP/alt-data/
Explore at:
Dataset updated
Sep 24, 2025
Dataset provided by
Description
Non-traditional data signals from social media and employment platforms for YELP stock analysis
e
yelp.com Traffic Analytics Data
analytics.explodingtopics.com
Updated Sep 1, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). yelp.com Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/yelp.com
Explore at:
Dataset updated
Sep 1, 2025
Variables measured
Global Rank, Monthly Visits, Authority Score, US Country Rank, Online Services Category Rank
Description
Traffic analytics, rankings, and competitive metrics for yelp.com as of September 2025
h
yelp-04-2024
huggingface.co
Updated Sep 12, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Adam Amer (2024). yelp-04-2024 [Dataset]. https://huggingface.co/datasets/adamamer20/yelp-04-2024
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 12, 2024
Authors
Adam Amer
License
https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/
Description
Yelp Complete Open Dataset 04.2024

Dataset Description

This dataset contains the complete Yelp Open Dataset, a rich collection of user reviews, business information, and user data. It is a valuable resource for tasks such as sentiment analysis, recommendation systems, and other natural language processing (NLP) projects.

Source

The dataset is provided by Yelp and is publicly available under the Yelp Dataset Terms of Use.

Dataset Structure

The dataset… See the full description on the dataset page: https://huggingface.co/datasets/adamamer20/yelp-04-2024.
🏪Yelp Reviews for Senti-Analysis Binary -N/P+
kaggle.com
zip
Updated May 9, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yassir Acharki (2022). 🏪Yelp Reviews for Senti-Analysis Binary -N/P+ [Dataset]. https://www.kaggle.com/datasets/yacharki/yelp-reviews-for-sentianalysis-binary-np-csv
Explore at:
zip(169583717 bytes)Available download formats
Dataset updated
May 9, 2022
Authors
Yassir Acharki
License
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
The Yelp reviews polarity dataset is constructed by considering stars 1 and 2 negative, and 3 and 4 positive. For each polarity 280,000 training samples and 19,000 testing samples are take randomly. In total there are 560,000 trainig samples and 38,000 testing samples. Negative polarity is class 1, and positive class 2.

The files train.csv and test.csv contain all the training samples as comma-sparated values. There are 2 columns in them, corresponding to class index (1 and 2) and review text. The review texts are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed with an "n" character, that is " ".
d
Dataplex: Google Reviews & Ratings Dataset | Track Consumer Sentiment &...
datarade.ai
.json, .csv
Updated Feb 3, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dataplex (2025). Dataplex: Google Reviews & Ratings Dataset | Track Consumer Sentiment & Location-Based Insights [Dataset]. https://datarade.ai/data-products/dataplex-google-reviews-ratings-dataset-track-consumer-s-dataplex
Explore at:
.json, .csvAvailable download formats
Dataset updated
Feb 3, 2025
Dataset authored and provided by
Dataplex
Area covered
United States
Description
The Google Reviews & Ratings Dataset provides businesses with structured insights into customer sentiment, satisfaction, and trends based on reviews from Google. Unlike broad review datasets, this product is location-specific—businesses provide the locations they want to track, and we retrieve as much historical data as possible, with daily updates moving forward.

This dataset enables businesses to monitor brand reputation, analyze consumer feedback, and enhance decision-making with real-world insights. For deeper analysis, optional AI-driven sentiment analysis and review summaries are available on a weekly, monthly, or yearly basis.

Dataset Highlights

Location-Specific Reviews – Reviews and ratings for the locations you provide.

Daily Updates – New reviews and rating changes updated automatically.

Historical Data Access – Retrieve past reviews where available.

AI Sentiment Analysis (Optional) – Summarized insights by week, month, or year.

Competitive Benchmarking – Compare performance across selected locations.

Use Cases

Franchise & Retail Chains – Monitor brand reputation and performance across locations.

Hospitality & Restaurants – Track guest sentiment and service trends.

Healthcare & Medical Facilities – Understand patient feedback for specific locations.

Real Estate & Property Management – Analyze tenant and customer experiences through reviews.

Market Research & Consumer Insights – Identify trends and analyze feedback patterns across industries.

Data Updates & Delivery

Update Frequency: Daily

Data Format: CSV for easy integration

Delivery: Secure file transfer (SFTP or cloud storage)

Data Fields Include:

Business Name

Location Details

Star Ratings

Review Text

Timestamps

Reviewer Metadata

Optional Add-Ons:

AI Sentiment Analysis – Aggregate trends by week, month, or year.

Custom Location Tracking – Tailor the dataset to fit your specific business needs.

Ideal for

Marketing Teams – Leverage real-world consumer feedback to optimize brand strategy.

Business Analysts – Use structured review data to track customer sentiment over time.

Operations & Customer Experience Teams – Identify service issues and opportunities for improvement.

Competitive Intelligence – Compare locations and benchmark against industry competitors.

Why Choose This Dataset?

Accurate & Up-to-Date – Daily updates ensure fresh, reliable data.

Scalable & Customizable – Track only the locations that matter to you.

Actionable Insights – AI-driven summaries for quick decision-making.

Easy Integration – Delivered in a structured format for seamless analysis.

By leveraging Google Reviews & Ratings Data, businesses can gain valuable insights into customer sentiment, enhance reputation management, and stay ahead of the competition.
c
Walmart reviews data in JSON format
crawlfeeds.com
json, zip
Updated Aug 26, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Crawl Feeds (2024). Walmart reviews data in JSON format [Dataset]. https://crawlfeeds.com/datasets/walmart-reviews-data-in-json-format
Explore at:
zip, jsonAvailable download formats
Dataset updated
Aug 26, 2024
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy
Description
Walmart Product Reviews Dataset provides an extensive collection of customer feedback that can be pivotal for businesses aiming to understand consumer preferences and behaviors. This dataset includes detailed information such as ratings, reviews, and timestamps, making it an invaluable resource for data analysts and market researchers. By analyzing the data, companies can identify trends, detect potential issues with products, and gauge overall customer satisfaction. Whether you're looking to optimize product offerings or enhance customer service, this dataset is a goldmine of actionable insights.

Leveraging the Walmart Ratings and Reviews Dataset for Competitive Advantage

Utilizing the Walmart Ratings and Reviews Dataset allows businesses to stay ahead of the competition by tapping into authentic customer experiences. This dataset is particularly useful for sentiment analysis, enabling companies to discern the emotional tone behind customer reviews. By doing so, businesses can refine their marketing strategies, address customer concerns proactively, and improve product development processes. Moreover, integrating this dataset with other data sources can provide a comprehensive view of market dynamics, helping companies make informed, data-driven decisions.

Walmart ratings and reviews dataset. Last extracted on 16 aug 2022
c
Unlocking User Sentiment: The App Store Reviews Dataset
crawlfeeds.com
json, zip
Updated Jun 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Crawl Feeds (2025). Unlocking User Sentiment: The App Store Reviews Dataset [Dataset]. https://crawlfeeds.com/datasets/app-store-reviews-dataset
Explore at:
json, zipAvailable download formats
Dataset updated
Jun 20, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy
Description
This dataset offers a focused and invaluable window into user perceptions and experiences with applications listed on the Apple App Store. It is a vital resource for app developers, product managers, market analysts, and anyone seeking to understand the direct voice of the customer in the dynamic mobile app ecosystem.

Dataset Specifications:

Investment: $45.0

Status: Published and immediately available.

Category: Ratings and Reviews Data

Format: Compressed ZIP archive containing JSON files, ensuring easy integration into your analytical tools and platforms.

Volume: Comprises 10,000 unique app reviews, providing a robust sample for qualitative and quantitative analysis of user feedback.

Timeliness: Last crawled: (This field is blank in your provided info, which means its recency is currently unknown. If this were a real product, specifying this would be critical for its value proposition.)

Richness of Detail (11 Comprehensive Fields):

Each record in this dataset provides a detailed breakdown of a single App Store review, enabling multi-dimensional analysis:

Review Content:

review: The full text of the user's written feedback, crucial for Natural Language Processing (NLP) to extract themes, sentiment, and common keywords.

title: The title given to the review by the user, often summarizing their main point.

isEdited: A boolean flag indicating whether the review has been edited by the user since its initial submission. This can be important for tracking evolving sentiment or understanding user behavior.

Reviewer & Rating Information:

username: The public username of the reviewer, allowing for analysis of engagement patterns from specific users (though not personally identifiable).

rating: The star rating (typically 1-5) given by the user, providing a quantifiable measure of satisfaction.

App & Origin Context:

app_name: The name of the application being reviewed.

app_id: A unique identifier for the application within the App Store, enabling direct linking to app details or other datasets.

country: The country of the App Store storefront where the review was left, allowing for geographic segmentation of feedback.

Metadata & Timestamps:

_id: A unique identifier for the specific review record in the dataset.

crawled_at: The timestamp indicating when this particular review record was collected by the data provider (Crawl Feeds).

date: The original date the review was posted by the user on the App Store.

Expanded Use Cases & Analytical Applications:

This dataset is a goldmine for understanding what users truly think and feel about mobile applications. Here's how it can be leveraged:

Product Development & Improvement:

Bug Detection & Prioritization: Analyze negative review text to identify recurring technical issues, crashes, or bugs, allowing developers to prioritize fixes based on user impact.

Feature Requests & Roadmap Prioritization: Extract feature suggestions from positive and neutral review text to inform future product roadmap decisions and develop features users actively desire.

User Experience (UX) Enhancement: Understand pain points related to app design, navigation, and overall usability by analyzing common complaints in the review field.

Version Impact Analysis: If integrated with app version data, track changes in rating and sentiment after new app updates to assess the effectiveness of bug fixes or new features.

Market Research & Competitive Intelligence:

Competitor Benchmarking: Analyze reviews of competitor apps (if included or combined with similar datasets) to identify their strengths, weaknesses, and user expectations within a specific app category.

Market Gap Identification: Discover unmet user needs or features that users desire but are not adequately provided by existing apps.

Niche Opportunities: Identify specific use cases or user segments that are underserved based on recurring feedback.

Marketing & App Store Optimization (ASO):

Sentiment Analysis: Perform sentiment analysis on the review and title fields to gauge overall user satisfaction, pinpoint specific positive and negative aspects, and track sentiment shifts over time.

Keyword Optimization: Identify frequently used keywords and phrases in reviews to optimize app store listings, improving discoverability and search ranking.

Messaging Refinement: Understand how users describe and use the app in their own words, which can inform marketing copy and advertising campaigns.

Reputation Management: Monitor rating trends and identify critical reviews quickly to facilitate timely responses and proactive customer engagement.

Academic & Data Science Research:

Natural Language Processing (NLP): The review and title fields are excellent for training and testing NLP models for sentiment analysis, topic modeling, named entity recognition, and text summarization.

User Behavior Analysis: Study patterns in rating distribution, isEdited status, and date to understand user engagement and feedback cycles.

Cross-Country Comparisons: Analyze country-specific reviews to understand regional differences in app perception, feature preferences, or cultural nuances in feedback.

This App Store Reviews dataset provides a direct, unfiltered conduit to understanding user needs and ultimately driving better app performance and greater user satisfaction. Its structured format and granular detail make it an indispensable asset for data-driven decision-making in the mobile app industry.
d
Product and Price Data, Product Reviews Data from Google Shopping |...
datarade.ai
.json, .csv
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
OpenWeb Ninja, Product and Price Data, Product Reviews Data from Google Shopping | Ecommerce Data | Real-Time API [Dataset]. https://datarade.ai/data-products/openweb-ninja-product-data-product-reviews-data-more-fro-openweb-ninja
Explore at:
.json, .csvAvailable download formats
Dataset authored and provided by
OpenWeb Ninja
Area covered
Yemen, Namibia, Martinique, Libya, Réunion, Mexico, Kosovo, Taiwan, Nigeria, Guinea
Description
OpenWeb Ninja's Product Data API provides Product Data, Product Reviews Data, Product Offers, sourced in real-time from Google Shopping - the largest product listings aggregate on the web, listing products from all publicly available e-commerce sites (Amazon, eBay, Walmart + many others).

The API covers more than 35 billion Product Data Listings, including Product Reviews and Product Offers across the web. The API provides over 40 product data points including prices, rating and reviews insights, product details and specs, typical price ranges, and more.

OpenWeb Ninja's Product Data common use cases: - Price Optimization & Price Comparison - Market Research & Competitive Analysis - Product Research & Trend Analysis - Customer Reviews Analysis

OpenWeb Ninja's Product Data Stats & Capabilities: - 35B+ Product Listings - 40+ data points per job listing - Global aggregate - Search by keyword or GTIN/EAN
yelp+amazon+imdb REVIEWS
kaggle.com
zip
Updated May 9, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Akash Kumar (2022). yelp+amazon+imdb REVIEWS [Dataset]. https://www.kaggle.com/datasets/akashkumar01/yelpamazonimdb
Explore at:
zip(82239 bytes)Available download formats
Dataset updated
May 9, 2022
Authors
Akash Kumar
Description
Review dataset from Amazon, Yelp and Imdb. This dataset can be used for NLP sentiment Analysis. LEVEL: Beginner

'1' is a Positive sentiment '0' is a Negative sentiment
H
Replication Data for: "Authentic and amazing": authenticity as an evaluative...
dataverse.harvard.edu
Updated Feb 12, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dominick Boyle (2024). Replication Data for: "Authentic and amazing": authenticity as an evaluative category in online consumer restaurant reviews. [Dataset]. http://doi.org/10.7910/DVN/9JVSMI
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.7910/DVN/9JVSMI
Dataset updated
Feb 12, 2024
Dataset provided by
Harvard Dataverse
Authors
Dominick Boyle
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset corresponds to the paper "'Authentic and amazing': authenticity as an evaluative category in online consumer restaurant reviews" appearing in Cultural Analytics. This dataset provides the R scripts used for the preparation, analysis as well as the import of data to Sketch Engine, the ID lists of the reviews in Corpus 1, 2 and 3, as well as the authenticity lexicons used which were derived from O'Connor et. al (2017) under a CC BY 4.0 license. The IDs correspond the those in the Yelp Dataset at the time of data collection (2019).
d
Replication Data for: \"A Topic-based Segmentation Model for Identifying...
search.dataone.org
Updated Sep 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kim, Sunghoon; Lee, Sanghak; McCulloch, Robert (2024). Replication Data for: \"A Topic-based Segmentation Model for Identifying Segment-Level Drivers of Star Ratings from Unstructured Text Reviews\" [Dataset]. http://doi.org/10.7910/DVN/EE3DE2
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/EE3DE2
Dataset updated
Sep 25, 2024
Dataset provided by
Harvard Dataverse
Authors
Kim, Sunghoon; Lee, Sanghak; McCulloch, Robert
Description
We provide instructions, codes and datasets for replicating the article by Kim, Lee and McCulloch (2024), "A Topic-based Segmentation Model for Identifying Segment-Level Drivers of Star Ratings from Unstructured Text Reviews." This repository provides a user-friendly R package for any researchers or practitioners to apply A Topic-based Segmentation Model with Unstructured Texts (latent class regression with group variable selection) to their datasets. First, we provide a R code to replicate the illustrative simulation study: see file 1. Second, we provide the user-friendly R package with a very simple example code to help apply the model to real-world datasets: see file 2, Package_MixtureRegression_GroupVariableSelection.R and Dendrogram.R. Third, we provide a set of codes and instructions to replicate the empirical studies of customer-level segmentation and restaurant-level segmentation with Yelp reviews data: see files 3-a, 3-b, 4-a, 4-b. Note, due to the dataset terms of use by Yelp and the restriction of data size, we provide the link to download the same Yelp datasets (https://www.kaggle.com/datasets/yelp-dataset/yelp-dataset/versions/6). Fourth, we provided a set of codes and datasets to replicate the empirical study with professor ratings reviews data: see file 5. Please see more details in the description text and comments of each file. [A guide on how to use the code to reproduce each study in the paper] 1. Full codes for replicating Illustrative simulation study.txt -- [see Table 2 and Figure 2 in main text]: This is R source code to replicate the illustrative simulation study. Please run from the beginning to the end in R. In addition to estimated coefficients (posterior means of coefficients), indicators of variable selections, and segment memberships, you will get dendrograms of selected groups of variables in Figure 2. Computing time is approximately 20 to 30 minutes 3-a. Preprocessing raw Yelp Reviews for Customer-level Segmentation.txt: Code for preprocessing the downloaded unstructured Yelp review data and preparing DV and IVs matrix for customer-level segmentation study. 3-b. Instruction for replicating Customer-level Segmentation analysis.txt -- [see Table 10 in main text; Tables F-1, F-2, and F-3 and Figure F-1 in Web Appendix]: Code for replicating customer-level segmentation study with Yelp data. You will get estimated coefficients (posterior means of coefficients), indicators of variable selections, and segment memberships. Computing time is approximately 3 to 4 hours. 4-a. Preprocessing raw Yelp reviews_Restaruant Segmentation (1).txt: R code for preprocessing the downloaded unstructured Yelp data and preparing DV and IVs matrix for restaurant-level segmentation study. 4-b. Instructions for replicating restaurant-level segmentation analysis.txt -- [see Tables 5, 6 and 7 in main text; Tables E-4 and E-5 and Figure H-1 in Web Appendix]: Code for replicating restaurant-level segmentation study with Yelp. you will get estimated coefficients (posterior means of coefficients), indicators of variable selections, and segment memberships. Computing time is approximately 10 to 12 hours. [Guidelines for running Benchmark models in Table 6] Unsupervised Topic model: 'topicmodels' package in R -- after determining the number of topics(e.g., with 'ldatuning' R package), run 'LDA' function in the 'topicmodels'package. Then, compute topic probabilities per restaurant (with 'posterior' function in the package) which can be used as predictors. Then, conduct prediction with regression Hierarchical topic model (HDP): 'gensimr' R package -- 'model_hdp' function for identifying topics in the package (see https://radimrehurek.com/gensim/models/hdpmodel.html or https://gensimr.news-r.org/). Supervised topic model: 'lda' R package -- 'slda.em' function for training and 'slda.predict' for prediction. Aggregate regression: 'lm' default function in R. Latent class regression without variable selection: 'flexmix' function in 'flexmix' R package. Run flexmix with a certain number of segments (e.g., 3 segments in this study). Then, with estimated coefficients and memberships, conduct prediction of dependent variable per each segment. Latent class regression with variable selection: 'Unconstraind_Bayes_Mixture' function in Kim, Fong and DeSarbo(2012)'s package. Run the Kim et al's model (2012) with a certain number of segments (e.g., 3 segments in this study). Then, with estimated coefficients and memberships, we can do prediction of dependent variables per each segment. The same R package ('KimFongDeSarbo2012.zip') can be downloaded at: https://sites.google.com/scarletmail.rutgers.edu/r-code-packages/home 5. Instructions for replicating Professor ratings review study.txt -- [see Tables G-1, G-2, G-4 and G-5, and Figures G-1 and H-2 in Web Appendix]: Code to replicate the Professor ratings reviews study. Computing time is approximately 10 hours. [A list of the versions of R, packages, and computer...
Datasets for Sentiment Analysis
zenodo.org
csv
Updated Dec 10, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Julie R. Repository creator - Campos Arias; Julie R. Repository creator - Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10157504
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Julie R. Repository creator - Campos Arias; Julie R. Repository creator - Campos Arias
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.

----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset is having the data of 1K+ Amazon Product's Ratings and Reviews as per their details listed on the official website of Amazon. The data was scraped in the month of January 2023 from the Official Website of Amazon.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
product_id - Product ID
product_name - Name of the Product
category - Category of the Product
discounted_price - Discounted Price of the Product
actual_price - Actual Price of the Product
discount_percentage - Percentage of Discount for the Product
rating - Rating of the Product
rating_count - Number of people who voted for the Amazon rating
about_product - Description about the Product
user_id - ID of the user who wrote review for the Product
user_name - Name of the user who wrote review for the Product
review_id - ID of the user review
review_title - Short review
review_content - Long review
img_link - Image Link of the Product
product_link - Official Website Link of the Product
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset, containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5331 rows contains only negative samples and the last 5331 rows contain only positive samples, thus the data should be shuffled before usage.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed amazon product review data of Gen3EcoDot (Alexa) scrapped entirely from amazon.in
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
----------- Amazon Earphones Reviews ----------------
This dataset consists of a 9930 Amazon reviews, star ratings, for 10 latest (as of mid-2019) bluetooth earphone devices for learning how to train Machine for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review (raw) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
t
IMDB and Yelp datasets - Dataset - LDM
service.tib.eu
Updated Dec 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2024). IMDB and Yelp datasets - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/imdb-and-yelp-datasets
Explore at:
Dataset updated
Dec 16, 2024
Description
IMDB and Yelp are datasets used for sentiment analysis and author identification.

Facebook

Twitter

Click to copy link

Link copied

Cite

Yelp (2012). yelp_review_full [Dataset]. https://huggingface.co/datasets/Yelp/yelp_review_full

yelp_review_full

YelpReviewFull

Yelp/yelp_review_full

Explore at:

67 scholarly articles cite this dataset (View in Google Scholar)

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Mar 6, 2012

Dataset authored and provided by

Yelphttp://yelp.com/

License

https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

Description

Dataset Card for YelpReviewFull

  Dataset Summary

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data.

  Supported Tasks and Leaderboards

text-classification, sentiment-classification: The dataset is mainly used for text classification: given the text, predict the sentiment.

  Languages

The reviews were mainly written in english.

  Dataset Structure





  Data Instances

A… See the full description on the dataset page: https://huggingface.co/datasets/Yelp/yelp_review_full.

Clear search

Close search

Google apps

Main menu

yelp_review_full

Processed YELP Dataset

Grepsr| Yelp Resturants Address and Reviews Data | Global Coverage with...

yelp open data philly restaurants

Yelp Dataset

Context

Content

Code snippet to read the files

YELP

Yelp Reviews

Yelp Inc. Alternative Data Analytics

yelp.com Traffic Analytics Data

yelp-04-2024

🏪Yelp Reviews for Senti-Analysis Binary -N/P+

Dataplex: Google Reviews & Ratings Dataset | Track Consumer Sentiment &...

Walmart reviews data in JSON format

Leveraging the Walmart Ratings and Reviews Dataset for Competitive Advantage

Unlocking User Sentiment: The App Store Reviews Dataset

Product and Price Data, Product Reviews Data from Google Shopping |...

yelp+amazon+imdb REVIEWS

Replication Data for: "Authentic and amazing": authenticity as an evaluative...

Replication Data for: \"A Topic-based Segmentation Model for Identifying...

Datasets for Sentiment Analysis

IMDB and Yelp datasets - Dataset - LDM

yelp_review_full

YelpReviewFull

Yelp/yelp_review_full