100+ datasets found
  1. Amazon review data 2018

    • cseweb.ucsd.edu
    • nijianmo.github.io
    Cite
    UCSD CSE Research Project, Amazon review data 2018 [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    Context

    This dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

    • More reviews:

      • The total number of reviews is 233.1 million (142.8 million in 2014).
    • New reviews:

      • Current data includes reviews in the range May 1996 - Oct 2018.
    • Metadata:

      • We have added transaction metadata for each review shown on the review page.
      • Added more detailed metadata of the product landing page.

    Acknowledgements

    If you publish articles based on this dataset, please cite the following paper:

    • Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. EMNLP, 2019.
  2. Amazon-Reviews-2023

    • huggingface.co
    Updated Sep 15, 2023
    Cite
    McAuley-Lab (2023). Amazon-Reviews-2023 [Dataset]. https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
    Dataset updated
    Sep 15, 2023
    Dataset authored and provided by
    McAuley-Lab
    Description

    Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features a larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).

  3. Amazon Review Dataset

    • paperswithcode.com
    Updated Apr 9, 2023
    Cite
    (2023). Amazon Review Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-review
    Dataset updated
    Apr 9, 2023
    Description

    Amazon Review is a dataset to tackle the task of identifying whether the sentiment of a product review is positive or negative. This dataset includes reviews from four different merchandise categories: Books (B) (2834 samples), DVDs (D) (1199 samples), Electronics (E) (1883 samples), and Kitchen and housewares (K) (1755 samples).

  4. Data from: The Multilingual Amazon Reviews Corpus

    • registry.opendata.aws
    Updated May 28, 2020
    Cite
    Amazon (2020). The Multilingual Amazon Reviews Corpus [Dataset]. https://registry.opendata.aws/amazon-reviews-ml/
    Dataset updated
    May 28, 2020
    Dataset provided by
    Amazon.com (http://amazon.com/)
    Description

    We present a collection of Amazon reviews specifically designed to aid research in multilingual text classification. The dataset contains reviews in English, Japanese, German, French, Chinese and Spanish, collected between November 1, 2015 and November 1, 2019. Each record in the dataset contains the review text, the review title, the star rating, an anonymized reviewer ID, an anonymized product ID and the coarse-grained product category (e.g. 'books', 'appliances', etc.)

  5. Amazon reviews Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Mar 21, 2023
    Cite
    Bright Data (2023). Amazon reviews Dataset [Dataset]. https://brightdata.com/products/datasets/amazon/reviews
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Mar 21, 2023
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements.

    Popular use cases include:

    • Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages.
    • Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations.
    • Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies.

    Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.

  6. amazon_us_reviews

    • huggingface.co
    • tensorflow.org
    Updated Jun 30, 2023
    Cite
    Polina Kazakova (2023). amazon_us_reviews [Dataset]. https://huggingface.co/datasets/polinaeterna/amazon_us_reviews
    Dataset updated
    Jun 30, 2023
    Authors
    Polina Kazakova
    License

    https://choosealicense.com/licenses/other/

    Description

    Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

    Over 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).

    Each dataset contains the following columns:

    • marketplace: 2 letter country code of the marketplace where the review was written.
    • customer_id: Random identifier that can be used to aggregate reviews written by a single author.
    • review_id: The unique ID of the review.
    • product_id: The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
    • product_parent: Random identifier that can be used to aggregate reviews for the same product.
    • product_title: Title of the product.
    • product_category: Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
    • star_rating: The 1-5 star rating of the review.
    • helpful_votes: Number of helpful votes.
    • total_votes: Number of total votes the review received.
    • vine: Review was written as part of the Vine program.
    • verified_purchase: The review is on a verified purchase.
    • review_headline: The title of the review.
    • review_body: The review text.
    • review_date: The date the review was written.
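
    The tab-delimited layout described above can be read with a few lines of Python. This is a minimal sketch, not the publisher's loader: the column order follows the list above, while the presence of a header row and the sample row are assumptions made for illustration.

```python
import csv
import io

# Column order as given in the listing above.
COLUMNS = [
    "marketplace", "customer_id", "review_id", "product_id", "product_parent",
    "product_title", "product_category", "star_rating", "helpful_votes",
    "total_votes", "vine", "verified_purchase", "review_headline",
    "review_body", "review_date",
]


def read_reviews(lines, has_header=True):
    """Yield one dict per review from tab-delimited lines."""
    # QUOTE_NONE: the files use no quote or escape characters, so a literal
    # double quote inside a field is kept as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:  # assumption: the first line names the columns
        next(reader, None)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip malformed lines rather than guess at their layout
        record = dict(zip(COLUMNS, row))
        for key in ("star_rating", "helpful_votes", "total_votes"):
            record[key] = int(record[key])
        yield record


# Hypothetical sample data, for illustration only.
sample_row = ["US", "1", "R1", "B0", "9", 'A 5" ruler', "Office",
              "4", "2", "3", "N", "Y", "Good", "Works fine", "2015-08-31"]
sample = io.StringIO("\t".join(COLUMNS) + "\n" + "\t".join(sample_row) + "\n")
reviews = list(read_reviews(sample))
```

    Turning quoting off matters here: with the csv module's default settings, a stray double quote in a product title could swallow the following tabs.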
  7. amazon-review-description

    • huggingface.co
    Updated Oct 21, 2024
    Cite
    TPP-LLM (2024). amazon-review-description [Dataset]. https://huggingface.co/datasets/tppllm/amazon-review-description
    Dataset updated
    Oct 21, 2024
    Dataset authored and provided by
    TPP-LLM
    License

    https://choosealicense.com/licenses/other/

    Description

    Amazon Review Description Dataset

    This dataset contains Amazon reviews from January 1, 2018, to June 30, 2018. It includes 2,245 sequences with 127,054 events across 18 category types. The original data is available at Amazon Review Data with citation information provided on the page. The detailed data preprocessing steps used to create this dataset can be found in the TPP-LLM paper and TPP-LLM-Embedding paper. If you find this dataset useful, we kindly invite you to cite the… See the full description on the dataset page: https://huggingface.co/datasets/tppllm/amazon-review-description.

  8. Amazon Reviews Full

    • figshare.com
    application/x-gzip
    Updated Nov 13, 2020
    Cite
    Luís Fred (2020). Amazon Reviews Full [Dataset]. http://doi.org/10.6084/m9.figshare.13232537.v1
    Available download formats: application/x-gzip
    Dataset updated
    Nov 13, 2020
    Dataset provided by
    figshare
    Authors
    Luís Fred
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Amazon Review Full Score Dataset

    Version 3, Updated 09/09/2015

    ORIGIN

    The Amazon reviews dataset consists of reviews from Amazon. The data span a period of 18 years, including ~35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. For more information, please refer to the following paper: J. McAuley and J. Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys, 2013.

    The Amazon reviews full score dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).

    DESCRIPTION

    The Amazon reviews full score dataset is constructed by randomly taking 600,000 training samples and 130,000 testing samples for each review score from 1 to 5. In total there are 3,000,000 training samples and 650,000 testing samples.

    The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 5), review title and review text. The review title and text are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
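
    The escaping rules described for train.csv and test.csv match the default dialect of Python's csv module, so a reader needs only one extra step to restore line breaks. The following is a minimal sketch under that reading of the description; the function name and the sample row are hypothetical.

```python
import csv
import io


def read_samples(lines):
    """Yield (class_index, title, text) tuples from train.csv / test.csv style lines."""
    # The csv defaults already handle fields wrapped in double quotes with
    # internal quotes doubled ("").
    for class_index, title, text in csv.reader(lines):
        # New lines are stored as a backslash followed by "n"; restore them.
        yield (
            int(class_index),
            title.replace("\\n", "\n"),
            text.replace("\\n", "\n"),
        )


# Hypothetical sample row, for illustration only.
sample = io.StringIO('"5","Great ""value""","Line one\\nLine two"\n')
rows = list(read_samples(sample))
```

    When reading the real files, open them with newline="" as the csv documentation recommends, so that the module handles line endings itself.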

  9. DATAANT | Amazon Data | E-commerce Product Review | Dataset, API | Reviews...

    • datarade.ai
    Cite
    Dataant, DATAANT | Amazon Data | E-commerce Product Review | Dataset, API | Reviews by keyword, by category, by seller, by product ASIN | 19 countries [Dataset]. https://datarade.ai/data-products/amazon-data-reviews-by-keyword-by-category-by-seller-by-p-dataant
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql
    Dataset authored and provided by
    Dataant
    Area covered
    Spain, Turkey, Canada, Netherlands, France, Brazil, United Arab Emirates, China, Poland, Germany
    Description

    Get the needed Amazon product review data right from the data extractor! Collect Amazon review information from 19 Amazon countries from the following domains:

    • amazon.com
    • amazon.com.au
    • amazon.com.br
    • amazon.ca
    • amazon.cn
    • amazon.fr
    • amazon.de
    • amazon.in
    • amazon.it
    • amazon.com.mx
    • amazon.nl
    • amazon.sg
    • amazon.es
    • amazon.com.tr

    Request Ecommerce Product Review dataset by:

    • keyword
    • category
    • seller
    • product ID (ASIN)

    Amazon E-commerce Reviews Data datasets gathered by keyword, seller, category, or ASIN contain:

    • Product ID (can be extended to the full product information)
    • Review content and rating
    • Review metadata

    Amazon extraction results can be delivered by schedule or API request, so the data can be extracted in real-time.

    DATAANT uses the in-house web scraping service with no concurrency limitations, so unlimited data extractions can be performed simultaneously.

    Output and attributes can be customized to fit your particular needs.

  10. amazon-reviews

    • huggingface.co
    Updated Apr 7, 2025
    Cite
    Sentence Transformers (2025). amazon-reviews [Dataset]. https://huggingface.co/datasets/sentence-transformers/amazon-reviews
    Dataset updated
    Apr 7, 2025
    Dataset authored and provided by
    Sentence Transformers
    Description

    Dataset Card for Amazon Reviews 2018

    This dataset is a collection of title-review pairs collected from Amazon, as collected in Ni et al. See Amazon Reviews 2018 for additional information. This dataset can be used directly with Sentence Transformers to train embedding models.

    Dataset Subsets

    pair subset

    Columns: "title", "review"
    Column types: str, str
    Examples: { 'title': "It doesn't fit my machine. I can't seem to ...", 'review': "It doesn't fit my… See the full description on the dataset page: https://huggingface.co/datasets/sentence-transformers/amazon-reviews.

  11. ‘Amazon Product Reviews Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Amazon Product Reviews Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-amazon-product-reviews-dataset-7933/latest
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Amazon Product Reviews Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/amazon-product-reviews-datasete on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    About this dataset

    This dataset contains 30K records of product reviews from amazon.com.

    This dataset was created by PromptCloud and DataStock

    Content

    This dataset contains the following:

    • Total Records Count: 43729

    • Domain Name: amazon.com

    • Date Range: 01st Jan 2020 - 31st Mar 2020

    • File Extension: CSV

    • Available Fields:
      -- Uniq Id,
      -- Crawl Timestamp,
      -- Billing Uniq Id,
      -- Rating,
      -- Review Title,
      -- Review Rating,
      -- Review Date,
      -- User Id,
      -- Brand,
      -- Category,
      -- Sub Category,
      -- Product Description,
      -- Asin,
      -- Url,
      -- Review Content,
      -- Verified Purchase,
      -- Helpful Review Count,
      -- Manufacturer Response

    Acknowledgements

    We wouldn't be here without the help of our in-house teams at PromptCloud and DataStock, who have put their heart and soul into this project, as they do with all our other projects. We want to provide the best quality data and we will continue to do so.

    Inspiration

    The inspiration for these datasets came from research. Reviews are important to everybody across the globe, so we decided to create this dataset, which shows exactly how user reviews help companies improve their products.

    This dataset was created by PromptCloud and contains around 0 samples along with Billing Uniq Id, Verified Purchase, technical information and other features such as: - Crawl Timestamp - Manufacturer Response - and more.

    How to use this dataset

    • Analyze Helpful Review Count in relation to Sub Category
    • Study the influence of Review Date on Product Description
    • More datasets

    Acknowledgements

    If you use this dataset in your research, please credit PromptCloud

    --- Original source retains full ownership of the source dataset ---

  12. amazon-reviews-sentiment-analysis

    • huggingface.co
    Cite
    fastai X Hugging Face Group 2022, amazon-reviews-sentiment-analysis [Dataset]. https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    fastai X Hugging Face Group 2022
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for amazon reviews for sentiment analysis

      Dataset Summary
    

    One of the most important problems in e-commerce is the correct calculation of the points given to after-sales products. The solution to this problem is to provide greater customer satisfaction for the e-commerce site, product prominence for sellers, and a seamless shopping experience for buyers. Another problem is the correct ordering of the comments given to the products. The prominence of misleading… See the full description on the dataset page: https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis.

  13. Amazon Fine Food Reviews

    • kaggle.com
    zip
    Updated May 1, 2017
    Cite
    Stanford Network Analysis Project (2017). Amazon Fine Food Reviews [Dataset]. https://www.kaggle.com/snap/Amazon-fine-food-reviews
    Available download formats: zip (253873708 bytes)
    Dataset updated
    May 1, 2017
    Dataset authored and provided by
    Stanford Network Analysis Project
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset consists of reviews of fine foods from amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain text review. It also includes reviews from all other Amazon categories.

    Contents

    • Reviews.csv: Pulled from the corresponding SQLite table named Reviews in database.sqlite
    • database.sqlite: Contains the table 'Reviews'

    Data includes:
    - Reviews from Oct 1999 - Oct 2012
    - 568,454 reviews
    - 256,059 users
    - 74,258 products
    - 260 users with > 50 reviews
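
    A statistic such as "users with > 50 reviews" can be recomputed from the Reviews table with a grouped query. The sketch below uses Python's built-in sqlite3 module against a tiny in-memory stand-in for database.sqlite; the UserId and Score column names are assumptions, since the listing does not give the table schema.

```python
import sqlite3


def users_with_more_than(conn, min_reviews):
    """Count users who wrote more than `min_reviews` reviews."""
    # Assumption: the Reviews table has a UserId column. Adjust the column
    # name to match the real file.
    query = """
        SELECT COUNT(*) FROM (
            SELECT UserId FROM Reviews
            GROUP BY UserId
            HAVING COUNT(*) > ?
        )
    """
    return conn.execute(query, (min_reviews,)).fetchone()[0]


# In-memory stand-in with hypothetical rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Reviews (UserId TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO Reviews VALUES (?, ?)",
    [("a", 5), ("a", 4), ("a", 3), ("b", 1)],
)
heavy_users = users_with_more_than(conn, 2)
```

    Against the real file, the connection would be opened with sqlite3.connect("database.sqlite") and the threshold set to 50.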

    Acknowledgements

    See this SQLite query for a quick sample of the dataset.

    If you publish articles based on this dataset, please cite the following paper:

  14. Amazon reviews - Full

    • academictorrents.com
    bittorrent
    Updated Oct 16, 2018
    Cite
    Xiang Zhang et al., 2015 (2018). Amazon reviews - Full [Dataset]. https://academictorrents.com/details/66ddbb6d5f49aa6c36a01ca5e814f1beef00b5b7
    Available download formats: bittorrent (643695014)
    Dataset updated
    Oct 16, 2018
    Dataset authored and provided by
    Xiang Zhang et al., 2015
    License

    https://academictorrents.com/nolicensespecified

    Description

    34,686,770 Amazon reviews from 6,643,669 users on 2,441,053 products, from the Stanford Network Analysis Project (SNAP). This full dataset contains 600,000 training samples and 130,000 testing samples in each class.

  15. Amazon reviews - Polarity

    • academictorrents.com
    bittorrent
    Updated Oct 16, 2018
    Cite
    Xiang Zhang et al., 2015 (2018). Amazon reviews - Polarity [Dataset]. https://academictorrents.com/details/db0cd5603a0d154ec3dcfc6ff7862d47d3884b83
    Available download formats: bittorrent (688339454)
    Dataset updated
    Oct 16, 2018
    Dataset authored and provided by
    Xiang Zhang et al., 2015
    License

    https://academictorrents.com/nolicensespecified

    Description

    34,686,770 Amazon reviews from 6,643,669 users on 2,441,053 products, from the Stanford Network Analysis Project (SNAP). This subset contains 1,800,000 training samples and 200,000 testing samples for each polarity class.

  16. U.S. Amazon shopper product review trust 2020

    • statista.com
    Updated Mar 24, 2025
    Cite
    Statista (2025). U.S. Amazon shopper product review trust 2020 [Dataset]. https://www.statista.com/statistics/623659/amazon-customer-review-usage-usa/
    Dataset updated
    Mar 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 30, 2020 - Feb 11, 2020
    Area covered
    United States
    Description

    This statistic presents the share of Amazon shoppers in the United States who trust product reviews as of February 2020. During the survey period, 24.6 percent of survey respondents stated that they only trusted reviews from Verified Purchasers.

  17. Amazon UK shoes products reviews dataset

    • crawlfeeds.com
    csv, zip
    Updated Jun 27, 2025
    Cite
    Crawl Feeds (2025). Amazon UK shoes products reviews dataset [Dataset]. https://crawlfeeds.com/datasets/amazon-uk-shoes-products-reviews-dataset
    Available download formats: csv, zip
    Dataset updated
    Jun 27, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Unlock detailed insights with our Amazon UK Shoes Products Reviews Dataset, an invaluable resource for businesses, researchers, and data analysts. This dataset features comprehensive information, including product names, review texts, star ratings, and customer feedback for a wide range of shoe products available on Amazon UK.

    Key Features:

    • Extensive Coverage: Includes detailed reviews and ratings for various shoe products, helping you analyze customer preferences and trends.

    • Structured Data: Available in easily accessible formats like product review dataset CSV, making it perfect for integration into your analytical workflows.

    • Actionable Insights: Leverage this dataset for customer sentiment analysis, product optimization, and competitive benchmarking.

    Why Choose the Amazon UK Shoes Products Reviews Dataset?

    Whether you're delving into customer behavior, conducting market research, or improving product offerings, this dataset empowers you to make informed decisions. By working with a dataset enriched with real-world feedback, you can:

    • Understand customer preferences: Dive into detailed reviews to uncover patterns in consumer likes and dislikes.

    • Enhance product offerings: Identify gaps and opportunities in the market to better meet customer demands.

    • Boost competitive analysis: Compare customer feedback across different brands and products.

    Additional Datasets Available

    Explore related datasets like the Amazon product review dataset, offering insights across various categories and regions. For specific needs, our curated product reviews dataset is tailored to help you gain a granular understanding of niche markets.

  18. Amazon-Fraud Dataset

    • paperswithcode.com
    Updated Dec 23, 2024
    Cite
    Yingtong Dou; Zhiwei Liu; Li Sun; Yutong Deng; Hao Peng; Philip S. Yu (2024). Amazon-Fraud Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-fraud
    Dataset updated
    Dec 23, 2024
    Authors
    Yingtong Dou; Zhiwei Liu; Li Sun; Yutong Deng; Hao Peng; Philip S. Yu
    Description

    Amazon-Fraud is a multi-relational graph dataset built upon the Amazon review dataset, which can be used in evaluating graph-based node classification, fraud detection, and anomaly detection models.

    Dataset Statistics

    # Nodes    % Fraud Nodes (Class=1)
    11,944     9.5

    Relation   # Edges
    U-P-U
    U-S-U
    U-V-U      1,036,737
    All

    Graph Construction

    The Amazon dataset includes product reviews under the Musical Instruments category. Similar to this paper, we label users with more than 80% helpful votes as benign entities and users with less than 20% helpful votes as fraudulent entities. We conduct a fraudulent user detection task on the Amazon-Fraud dataset, which is a binary classification task. We take 25 handcrafted features from this paper as the raw node features for Amazon-Fraud. We take users as nodes in the graph and design three relations: 1) U-P-U: it connects users reviewing at least one same product; 2) U-S-U: it connects users having at least one same star rating within one week; 3) U-V-U: it connects users with top 5% mutual review text similarities (measured by TF-IDF) among all users.
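
    The labeling rule and the U-P-U relation described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the authors' code: the function names, the handling of users without votes, and the sample data are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def label_user(helpful_votes, total_votes):
    """Return 0 (benign), 1 (fraudulent), or None (unlabeled)."""
    if total_votes == 0:
        return None  # assumption: users without any votes stay unlabeled
    ratio = helpful_votes / total_votes
    if ratio > 0.8:
        return 0  # more than 80% helpful votes: benign
    if ratio < 0.2:
        return 1  # less than 20% helpful votes: fraudulent
    return None


def upu_edges(reviews):
    """Build U-P-U edges: pairs of users who reviewed at least one same product."""
    users_by_product = defaultdict(set)
    for user, product in reviews:
        users_by_product[product].add(user)
    edges = set()
    for users in users_by_product.values():
        # Sorting gives each undirected edge one canonical orientation.
        edges.update(combinations(sorted(users), 2))
    return edges


# Hypothetical (user, product) pairs, for illustration only.
reviews = [("u1", "p1"), ("u2", "p1"), ("u3", "p2"), ("u1", "p2")]
edges = upu_edges(reviews)
```

    The U-S-U and U-V-U relations follow the same pattern with a different grouping key (star rating within a one-week window) or a similarity threshold (top 5% TF-IDF similarity) in place of the shared product.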

    To download the dataset, please visit this GitHub repo. For any other questions, please email ytongdou(AT)gmail.com.

  19. Amazon US Customer Reviews Dataset

    • kaggle.com
    Updated Jun 16, 2021
    Cite
    Cynthia Rempel (2021). Amazon US Customer Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/cynthiarempel/amazon-us-customer-reviews-dataset/discussion
    Dataset updated
    Jun 16, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Cynthia Rempel
    Description

    Amazon Customer Reviews Dataset

    Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon’s iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

    Content

    • marketplace: 2 letter country code of the marketplace where the review was written.
    • customer_id: Random identifier that can be used to aggregate reviews written by a single author.
    • review_id: The unique ID of the review.
    • product_id: The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
    • product_parent: Random identifier that can be used to aggregate reviews for the same product.
    • product_title: Title of the product.
    • product_category: Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
    • star_rating: The 1-5 star rating of the review.
    • helpful_votes: Number of helpful votes.
    • total_votes: Number of total votes the review received.
    • vine: Review was written as part of the Vine program.
    • verified_purchase: The review is on a verified purchase.
    • review_headline: The title of the review.
    • review_body: The review text.
    • review_date: The date the review was written.

    License

    By accessing the Amazon Customer Reviews Library ("Reviews Library"), you agree that the Reviews Library is an Amazon Service subject to the Amazon.com Conditions of Use (https://www.amazon.com/gp/help/customer/display.html/ref=footer_cou?ie=UTF8&nodeId=508088) and you agree to be bound by them, with the following additional conditions:

    In addition to the license rights granted under the Conditions of Use, Amazon or its content providers grant you a limited, non-exclusive, non-transferable, non-sublicensable, revocable license to access and use the Reviews Library for purposes of academic research. You may not resell, republish, or make any commercial use of the Reviews Library or its contents, including use of the Reviews Library for commercial research, such as research related to a funding or consultancy contract, internship, or other relationship in which the results are provided for a fee or delivered to a for-profit organization. You may not (a) link or associate content in the Reviews Library with any personal information (including Amazon customer accounts), or (b) attempt to determine the identity of the author of any content in the Reviews Library. If you violate any of the foregoing conditions, your license to access and use the Reviews Library will automatically terminate without prejudice to any of the other rights or remedies Amazon may have. https://s3.amazonaws.com/amazon-reviews-pds/license.txt

    Acknowledgements

    Provided by Amazon... https://s3.amazonaws.com/amazon-reviews-pds/readme.html

    Inspiration

    What kinds of questions can be answered by the amazon us customer dataset?

  20. Amazon Polarity Dataset

    • paperswithcode.com
    Updated May 15, 2024
    Cite
    (2024). Amazon Polarity Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-polarity-1
    Dataset updated
    May 15, 2024
    Description

    The Amazon Polarity dataset is a set of reviews from Amazon. The dataset is constructed by taking review scores 1 and 2 as negative (class 1), and 4 and 5 as positive (class 2). Reviews with a score of 3 are ignored. The dataset spans a period of 18 years, including approximately 35 million reviews up to March 2013. Each class in the dataset has 1,800,000 training samples and 200,000 testing samples. The dataset includes product and user information, ratings, and a plaintext review.
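
    The class construction described above is a simple mapping from review score to polarity label. A minimal sketch follows; the function name and sample scores are hypothetical.

```python
def polarity_class(score):
    """Map a 1-5 review score to a polarity class, or None if the review is ignored."""
    if score in (1, 2):
        return 1  # negative
    if score in (4, 5):
        return 2  # positive
    if score == 3:
        return None  # reviews with a score of 3 are ignored
    raise ValueError(f"score out of range: {score}")


# Hypothetical scores, for illustration only.
scores = [1, 3, 5, 2, 4]
labels = [c for c in map(polarity_class, scores) if c is not None]
```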

Amazon review data 2018

79 scholarly articles cite this dataset