According to a global 2021 report, out of 2.7 million online fake reviews that were detected and removed, 46 percent were five-star reviews. Additionally, almost a third of all fake reviews that were removed were one-star reviews, and eight percent were two-star.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: With the growth of e-commerce, more people are buying products over the internet. To increase customer satisfaction, merchants provide spaces for product and service reviews. Products with positive reviews attract customers, while products with negative reviews lose customers. Following this idea, some individuals and corporations write fake reviews to promote their products and services or defame their competitors. The difficulty in finding these reviews lies in the large amount of information available. One solution is to use data mining techniques and tools, such as the classification function. Exploring this situation, the present work evaluates classification techniques to identify fake reviews about products and services on the Internet. The research also presents a systematic literature review on fake reviews. The research used 8 classification algorithms. The algorithms were trained and tested with a hotels database. The CONCENSO algorithm presented the best result, with 88% in the precision indicator. After the first test, the algorithms classified reviews on another hotels database. To compare the results of this new classification, the Review Skeptic algorithm was used. The SVM and GLMNET algorithms presented the highest convergence with the Review Skeptic algorithm, classifying 83% of reviews with the same result. The research contributes by demonstrating the algorithms' ability to understand consumers' real reviews of products and services on the Internet. Another contribution is to be the pioneer in the investigation of fake reviews in Brazil and in production engineering.
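The two indicators mentioned in the abstract, precision and the share of reviews on which two classifiers give the same result, can be illustrated with a minimal Python sketch. The function names and the toy labels below are assumptions made for illustration; they are not taken from the study, and the study's own tooling is not described here.

```python
def precision(y_true, y_pred, positive="fake"):
    """Share of items predicted as `positive` that are truly `positive`."""
    truths_of_predicted = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_of_predicted:
        return 0.0  # nothing was predicted positive
    hits = sum(1 for t in truths_of_predicted if t == positive)
    return hits / len(truths_of_predicted)


def agreement(pred_a, pred_b):
    """Fraction of items on which two classifiers output the same label."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    if not pred_a:
        return 0.0
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Toy labels, invented for illustration only.
y_true = ["fake", "real", "fake", "real", "fake"]
model_a = ["fake", "fake", "fake", "real", "real"]
model_b = ["fake", "real", "fake", "real", "real"]

print(precision(y_true, model_a))   # precision of model_a on the "fake" class
print(agreement(model_a, model_b))  # share of identical predictions
```

Agreement between two classifiers says nothing about correctness on its own; it only measures how often they converge on the same label.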
In 2024, ** percent of U.S. consumers answering a survey were confident in having seen fake product reviews on Amazon. Although the number might seem very high, the figure has decreased compared to 2023, when ** percent of respondents stated the same.
According to a January 2024 analysis, the share of travel reviews on Tripadvisor deemed to be fake is predicted to double in 2023 compared to the previous year. As forecast, fake reviews are expected to account for an estimated 8.8 percent of all reviews on the travel website that year. This figure would represent a 267 percent increase compared to 2018.
The generated fake reviews dataset contains 20k fake reviews and 20k real product reviews. OR = Original reviews (presumably human created and authentic); CG = Computer-generated fake reviews.
Citation: Salminen, J., Kandpal, C., Kamel, A. M., Jung, S., & Jansen, B. J. (2022). Creating and detecting fake reviews of online products. Journal of Retailing and Consumer Services, 64, 102771. https://doi.org/10.1016/j.jretconser.2021.102771
Acknowledgement: Photo by Brett Jordan on Unsplash
Original Data Source: 🚨 Fake Reviews Dataset
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
This is a binary dataset used for Bengali fake review detection in the paper "Bengali Fake Reviews: A Benchmark Dataset and Detection System" accepted in Neurocomputing, a journal published by Elsevier.
Annotated by 4 native Bangla speakers with more than 90% trustworthiness score.
Fleiss' Kappa Score: 0.83
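For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of the standard Fleiss' kappa computation in Python. The input layout (one row per item, one count per category) and the toy ratings are assumptions for illustration; they are not the BFRD annotation data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j. Every item must be rated
    by the same number of raters (at least 2)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(p * p for p in proportions)

    if p_chance == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy example: 4 raters, categories [fake, non-fake], invented counts.
perfect = [[4, 0], [0, 4], [4, 0], [0, 4]]
split = [[2, 2], [2, 2]]
print(fleiss_kappa(perfect))
print(fleiss_kappa(split))
```

A value near 1 indicates agreement well above chance, while values near or below 0 indicate agreement no better than chance.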
Total number of samples
Fake - 1339 Non-fake - 7710
Class-wise statistics of the BFRD dataset
Statistics | Fake | Non-fake |
---|---|---|
Total words | 1,55,789 | 9,27,902 |
Total… See the full description on the dataset page: https://huggingface.co/datasets/shawon95/Bengali-Fake-Review-Dataset.
This statistic displays the results of a survey conducted in 2017 about the perception that fake product reviews have become a norm in the e-commerce industry according to consumers in India. During the survey period, 72 percent of respondents agreed that fake product reviews have become a norm in the e-commerce industry.
This dataset contains anonymised data of accounts and reviews, labelled as fake/real collected through scraping of Google Maps. It is a part of the research described under this link.
Please cite the article describing this dataset as: P. Gryka and A. Janicki, “Detecting Fake Reviews in Google Maps—A Case Study,” Applied Sciences, vol. 13, no. 10, p. 6331, May 2023, doi: 10.3390/app13106331.
CC BY 4.0
Original Data Source: GMR-PL Fake reviews dataset
This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
New reviews:
Metadata: - We have added transaction metadata for each review shown on the review page.
If you publish articles based on this dataset, please cite the following paper:
https://brightdata.com/license
Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements. Popular use cases include:
- Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages.
- Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations.
- Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies.
Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.
Yelp-Fraud is a multi-relational graph dataset built upon the Yelp spam review dataset, which can be used in evaluating graph-based node classification, fraud detection, and anomaly detection models.
Dataset Statistics
# Nodes | %Fraud Nodes (Class=1) |
---|---|
45,954 | 14.5 |
Relation | # Edges |
---|---|
R-U-R | |
R-T-R | |
R-S-R | 3,402,743 |
All |
Graph Construction
The Yelp spam review dataset includes hotel and restaurant reviews filtered (spam) and recommended (legitimate) by Yelp. We conduct a spam review detection task on the Yelp-Fraud dataset which is a binary classification task. We take 32 handcrafted features from SpEagle paper as the raw node features for Yelp-Fraud. Based on previous studies which show that opinion fraudsters have connections in user, product, review text, and time, we take reviews as nodes in the graph and design three relations: 1) R-U-R: it connects reviews posted by the same user; 2) R-S-R: it connects reviews under the same product with the same star rating (1-5 stars); 3) R-T-R: it connects two reviews under the same product posted in the same month.
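The three relations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dataset authors' construction code: the record fields (`id`, `user`, `product`, `stars`, `month`) and the helper `relation_edges` are hypothetical names, and the toy reviews are invented.

```python
from collections import defaultdict
from itertools import combinations


def relation_edges(reviews, key):
    """Connect every pair of reviews that share the same grouping key.

    Returns a set of undirected edges (a, b) with a < b.
    """
    groups = defaultdict(list)
    for review in reviews:
        groups[key(review)].append(review["id"])
    edges = set()
    for ids in groups.values():
        edges.update(combinations(sorted(ids), 2))
    return edges


# Toy reviews, invented for illustration only.
reviews = [
    {"id": 0, "user": "u1", "product": "p1", "stars": 5, "month": "2012-03"},
    {"id": 1, "user": "u1", "product": "p2", "stars": 1, "month": "2012-03"},
    {"id": 2, "user": "u2", "product": "p1", "stars": 5, "month": "2012-03"},
    {"id": 3, "user": "u3", "product": "p1", "stars": 4, "month": "2012-07"},
]

# R-U-R: same user; R-S-R: same product and star rating;
# R-T-R: same product and same month.
r_u_r = relation_edges(reviews, key=lambda r: r["user"])
r_s_r = relation_edges(reviews, key=lambda r: (r["product"], r["stars"]))
r_t_r = relation_edges(reviews, key=lambda r: (r["product"], r["month"]))

print(r_u_r, r_s_r, r_t_r)
```

Pairwise enumeration grows quadratically with group size, which is one plausible reason a relation such as R-S-R ends up with millions of edges on the full dataset.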
To download the dataset, please visit this GitHub repo. For any other questions, please email ytongdou(AT)gmail.com.
Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).
In 2021, Google's share of online reviews increased to 71 percent, up from 67 percent in 2020, indicating a rise in willingness from consumers to share their experiences and opinions online. Overall, Google is the platform and search engine on which most consumers leave reviews for local businesses.
OpenWeb Ninja's Product Data API provides Product Data, Product Reviews Data, Product Offers, sourced in real-time from Google Shopping - the largest product listings aggregate on the web, listing products from all publicly available e-commerce sites (Amazon, eBay, Walmart + many others).
The API covers more than 35 billion Product Data Listings, including Product Reviews and Product Offers across the web. The API provides over 40 product data points including prices, rating and reviews insights, product details and specs, typical price ranges, and more.
OpenWeb Ninja's Product Data common use cases:
- Price Optimization & Price Comparison
- Market Research & Competitive Analysis
- Product Research & Trend Analysis
- Customer Reviews Analysis
OpenWeb Ninja's Product Data Stats & Capabilities:
- 35B+ Product Listings
- 40+ data points per product listing
- Global aggregate
- Search by keyword or GTIN/EAN
Simple pricing, pay per successful result only. Say goodbye to being charged for failed requests.
Filter results by number of reviews, date
Review data includes meta data about customers such as avatar, location, profile url, etc.
Get page meta data like product price information, rating distribution, etc.
We present a collection of Amazon reviews specifically designed to aid research in multilingual text classification. The dataset contains reviews in English, Japanese, German, French, Chinese and Spanish, collected between November 1, 2015 and November 1, 2019. Each record in the dataset contains the review text, the review title, the star rating, an anonymized reviewer ID, an anonymized product ID and the coarse-grained product category (e.g. 'books', 'appliances', etc.)
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
This is a small subset of a dataset of book reviews from the Amazon Kindle Store category.
5-core dataset of product reviews from the Amazon Kindle Store category, covering May 1996 - July 2014. Contains a total of 982619 entries. Each reviewer has at least 5 reviews and each product has at least 5 reviews in this dataset.
Columns:
- asin - ID of the product, like B000FA64PK
- helpful - helpfulness rating of the review, for example 2/3
- overall - rating of the product
- reviewText - text of the review (heading)
- reviewTime - time of the review (raw)
- reviewerID - ID of the reviewer, like A3SPTOKDG7WBLN
- reviewerName - name of the reviewer
- summary - summary of the review (description)
- unixReviewTime - unix timestamp
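The 5-core condition described above (every reviewer and every product has at least 5 reviews) can be reproduced on raw review data with an iterative filter. The sketch below is an assumption about how such a subset could be derived, not the procedure actually used to build this dataset; it relies only on the `reviewerID` and `asin` column names listed here, and the toy records are invented.

```python
from collections import Counter


def k_core(reviews, k=5):
    """Repeatedly drop reviews until every remaining reviewer and every
    remaining product has at least k reviews."""
    kept = list(reviews)
    while True:
        per_reviewer = Counter(r["reviewerID"] for r in kept)
        per_product = Counter(r["asin"] for r in kept)
        filtered = [
            r for r in kept
            if per_reviewer[r["reviewerID"]] >= k and per_product[r["asin"]] >= k
        ]
        if len(filtered) == len(kept):
            return filtered  # stable: nothing more to remove
        kept = filtered


# Toy records, invented for illustration; k=2 keeps the example small.
toy = [
    {"reviewerID": "A", "asin": "X"},
    {"reviewerID": "A", "asin": "Y"},
    {"reviewerID": "B", "asin": "X"},
    {"reviewerID": "B", "asin": "Y"},
    {"reviewerID": "C", "asin": "X"},
]
core = k_core(toy, k=2)
print(len(core))
```

The loop is needed because removing one sparse reviewer can push a product below the threshold, which in turn can affect other reviewers.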
There are two files: one is preprocessed and ready for sentiment analysis, and the other is unprocessed, so you have to process the dataset yourself before performing sentiment analysis.
This dataset is taken from Amazon product data, Julian McAuley, UCSD website. http://jmcauley.ucsd.edu/data/amazon/
The license to the data files belongs to them.
-Sentiment analysis on reviews. -Understanding how people rate usefulness of a review/ What factors influence helpfulness of a review. -Fake reviews/ outliers. -Best rated product IDs, or similarity between products based on reviews alone (not the best idea ikr). -Any other interesting analysis
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset supports the article entitled "Effects of Customer Reviews on Product Sales of Strong Brands: A Qualitative Comparative Analysis."
https://brightdata.com/license
The Google Reviews dataset is perfect for obtaining comprehensive insights into businesses and their customer feedback globally. Easily filter by location, business type, or reviewer details to extract the precise data you need. The Google Reviews dataset includes key data points such as URL, place ID, place name, country, address, review ID, reviewer name, total reviews and photos by the reviewer, reviewer profile URL, and more. This dataset provides valuable information for sentiment analysis, business comparisons, and customer behavior studies.
According to a survey conducted in January 2023, 33 percent of consumers in the United States reported always reading online reviews of local businesses. Over 43 percent said that they regularly read online reviews of local businesses, and just two percent said that they never read online reviews.