93 datasets found

amazon-reviews-sentiment-analysis
huggingface.co
Cite
fastai X Hugging Face Group 2022, amazon-reviews-sentiment-analysis [Dataset]. https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset provided by
Hugging Face: https://huggingface.co/
Authors
fastai X Hugging Face Group 2022
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Dataset Card for amazon reviews for sentiment analysis

Dataset Summary

One of the most important problems in e-commerce is correctly calculating the ratings given to products after a sale. Solving this problem provides greater customer satisfaction for the e-commerce site, product prominence for sellers, and a seamless shopping experience for buyers. Another problem is the correct ordering of the comments given to the products. The prominence of misleading… See the full description on the dataset page: https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis.
Datasets for Sentiment Analysis
zenodo.org
csv
Updated Dec 10, 2023
Cite
Julie R. Campos Arias (repository creator) (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.10157504
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Julie R. Campos Arias (repository creator)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.

----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains the ratings and reviews of 1K+ Amazon products, as listed on the official Amazon website. The data was scraped from the official Amazon website in January 2023.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
product_id - Product ID
product_name - Name of the Product
category - Category of the Product
discounted_price - Discounted Price of the Product
actual_price - Actual Price of the Product
discount_percentage - Percentage of Discount for the Product
rating - Rating of the Product
rating_count - Number of people who voted for the Amazon rating
about_product - Description about the Product
user_id - ID of the user who wrote review for the Product
user_name - Name of the user who wrote review for the Product
review_id - ID of the user review
review_title - Short review
review_content - Long review
img_link - Image Link of the Product
product_link - Official Website Link of the Product
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.
This data was collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
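Because the rows are grouped by label, a plain sequential train/test split would put a single class in one of the parts. The sketch below shows one way to shuffle rows reproducibly with the Python standard library. The toy rows and the helper name shuffle_rows are illustrative assumptions, not part of the repository.

```python
import random

# Toy stand-in for data_rt.csv (assumed layout): negatives first, then positives.
rows = [(f"negative review {i}", 0) for i in range(5)] + \
       [(f"positive review {i}", 1) for i in range(5)]

def shuffle_rows(rows, seed=42):
    """Return a shuffled copy of rows; a fixed seed keeps the order reproducible."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled

shuffled = shuffle_rows(rows)
```

Shuffling a copy leaves the original order available, and the fixed seed makes later experiments comparable.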
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
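The division column in this file is derived from the TextBlob polarity score, which ranges from -1.0 to 1.0. The description does not state the cut-off values that were used, so the sketch below assumes a symmetric threshold around zero. The function name, the label strings, and the threshold are all illustrative assumptions.

```python
def division_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1.0, 1.0] to a categorical label.

    The threshold is an assumption: scores above it are positive,
    scores below its negation are negative, the rest are neutral.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

labels = [division_from_polarity(p) for p in (0.5, -0.3, 0.0)]
```

A wider threshold (for example 0.1) would move weakly scored reviews into the neutral class.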
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9930 Amazon reviews and star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
Amazon Reviews Dataset
kaggle.com
Updated Sep 20, 2024
Cite
Dongre Laxman (2024). Amazon Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/dongrelaxman/amazon-reviews-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Sep 20, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Dongre Laxman
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset comprises customer reviews for Amazon, an online retail giant, featuring insights into customer experiences, including ratings, review titles, texts, and metadata. It is valuable for analyzing customer satisfaction, sentiment, and trends.

Column Descriptions:

Reviewer Name: Identifies the reviewer.
Profile Link: Links to the reviewer's profile for additional insights.
Country: Indicates the reviewer's location.
Review Count: Number of reviews by the same user, showing engagement level.
Review Date: When the review was posted, useful for time analysis.
Rating: Numerical satisfaction measure.
Review Title: Summarizes the review sentiment.
Review Text: Detailed customer feedback.
Date of Experience: When the service/product was experienced.

Prospective applications:

Sentiment Analysis: Analyze review texts and titles to assess overall customer sentiment toward products, enabling the identification of strengths and weaknesses.
Customer Satisfaction Tracking: Track and visualize rating trends over time to understand fluctuations in customer satisfaction.
Product Improvement: Identify common themes in reviews to highlight areas for product enhancement or development.
Market Segmentation: Use country and demographic information to customize marketing strategies and gain insights into regional preferences.
Competitor Analysis: Evaluate customer feedback on Amazon products in comparison to competitors to determine market positioning.
Recommendation Systems: Leverage review data to enhance recommendation algorithms, improving personalized shopping experiences.
Trend Analysis: Investigate temporal patterns in reviews to link sentiment changes with marketing efforts or product launches.

This extensive dataset serves as a valuable asset for various analyses focused on enhancing customer engagement and refining business strategies.
Amazon Reviews Dataset
kaggle.com
Updated Jan 2, 2023
Cite
Daniel Ihenacho (2023). Amazon Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/danielihenacho/amazon-reviews-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 2, 2023
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Daniel Ihenacho
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset was created from scraped reviews of products on Amazon for the purpose of text classification. There are three classes:
- Negative Reviews
- Neutral Reviews
- Positive Reviews

The data columns are:
- Sentiments
- Cleaned Review
- Cleaned Review Length
- Review Score

This dataset presents a multiclass classification problem that can be tackled with ML algorithms as well as deep learning algorithms. Moreover, there is a class imbalance: negative reviews are the least numerous class compared with positive and neutral reviews.

For ML algorithms, use the mapping negative --> -1, neutral --> 0, positive --> 1.

For deep learning algorithms, use the mapping negative --> 0, neutral --> 1, positive --> 2.
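
The two mappings can be kept as small lookup tables, as in the sketch below. It assumes the Sentiments column holds the strings negative, neutral, and positive (case may vary); the names ML_LABELS, DL_LABELS, and encode are hypothetical. The 0-based variant is the usual choice for deep learning because common loss functions expect class indices starting at 0.

```python
# Mappings taken from the dataset description above.
ML_LABELS = {"negative": -1, "neutral": 0, "positive": 1}
DL_LABELS = {"negative": 0, "neutral": 1, "positive": 2}

def encode(sentiments, mapping):
    """Convert sentiment strings to integer labels, ignoring case and padding."""
    return [mapping[s.strip().lower()] for s in sentiments]

sample = ["Positive", "negative", "Neutral"]  # made-up values for illustration
ml_encoded = encode(sample, ML_LABELS)
dl_encoded = encode(sample, DL_LABELS)
```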

Looking forward to your model discoveries on this dataset.

Please leave an upvote if you find this relevant 😀.
Amazon-Reviews-2023
huggingface.co
Updated Sep 15, 2023
Cite
McAuley-Lab (2023). Amazon-Reviews-2023 [Dataset]. https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
Dataset updated
Sep 15, 2023
Dataset authored and provided by
McAuley-Lab
Description
Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).
Some examples of Amazon reviews dataset.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Feb 14, 2024
Cite
Semary, Noura A.; Pławiak, Paweł; Hammad, Mohamed; Ahmed, Wesam; Amin, Khalid (2024). Some examples of Amazon reviews dataset. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001447177
Dataset updated
Feb 14, 2024
Authors
Semary, Noura A.; Pławiak, Paweł; Hammad, Mohamed; Ahmed, Wesam; Amin, Khalid
Description
A crucial part of sentiment classification is feature extraction, because it involves extracting valuable information from text data, which affects the model’s performance. The goal of this paper is to help in selecting a suitable feature extraction method to enhance the performance of sentiment analysis tasks. In order to provide directions for future machine learning and feature extraction research, it is important to analyze and summarize feature extraction techniques methodically from a machine learning standpoint. There are several methods under consideration, including Bag-of-words (BOW), Word2Vector, N-gram, Term Frequency-Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global vector for word representation (GloVe). To prove the ability of each feature extractor, we applied it to the Twitter US airlines and Amazon musical instrument reviews datasets. Finally, we trained a random forest classifier using 70% of the data for training and 30% for testing, enabling us to evaluate and compare the performance using different metrics. Based on our results, we find that the TF-IDF technique demonstrates superior performance, with an accuracy of 99% on the Amazon reviews dataset and 96% on the Twitter US airlines dataset. This study underscores the significance of feature extraction in sentiment analysis, offering practical insights to improve model performance and guide future research.
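To make the TF-IDF idea concrete, here is a minimal standard-library sketch of one common textbook variant (term frequency as count divided by document length, inverse document frequency as log(N / df)). It is not the authors' implementation, which is not shown in this listing; the whitespace tokenizer, the function name tf_idf, and the three toy documents are assumptions for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        if not tokens:  # empty document: no terms to weight
            weights.append({})
            continue
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = ["great guitar great sound", "poor sound", "great value"]  # toy reviews
w = tf_idf(docs)
```

In the first toy document, "guitar" ends up weighted above "great" even though "great" occurs twice, because "great" also appears in another document and is therefore less distinctive.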
Amazon Reviews for Sentiment Analysis
kaggle.com
zip
Updated Nov 18, 2019
Cite
Adam Bittlingmayer (2019). Amazon Reviews for Sentiment Analysis [Dataset]. https://www.kaggle.com/bittlingmayer/amazonreviews
Explore at:
Available download formats: zip (517080965 bytes)
Dataset updated
Nov 18, 2019
Authors
Adam Bittlingmayer
Description
This dataset consists of a few million Amazon customer reviews (input text) and star ratings (output labels) for learning how to train fastText for sentiment analysis.

The idea here is a dataset that is more than a toy - real business data on a reasonable scale - but can be trained on in minutes on a modest laptop.

Content

The fastText supervised learning tutorial requires data in the following format:

_label_
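
The listing is cut off at this point, so the exact layout used by this dataset is not shown here. As a hedged illustration: fastText's supervised mode by default expects each training line to start with the prefix __label__ followed by the label, then the text. The helper below is a hypothetical sketch of producing such a line; the label value 2 and the sample text are made up.

```python
def to_fasttext_line(label, text):
    """Format one example as '__label__<label> <text>' on a single line."""
    # Collapse newlines and repeated whitespace so each example stays on one line.
    one_line = " ".join(text.split())
    return f"__label__{label} {one_line}"

line = to_fasttext_line(2, "Great sound,\nworks well")
```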
Amazon-Reviews-Dataset
huggingface.co
Updated Aug 16, 2025
Cite
DataHive AI (2025). Amazon-Reviews-Dataset [Dataset]. https://huggingface.co/datasets/datahiveai/Amazon-Reviews-Dataset
Dataset updated
Aug 16, 2025
Dataset authored and provided by
DataHive AI
License
Attribution-NonCommercial 2.0 (CC BY-NC 2.0): https://creativecommons.org/licenses/by-nc/2.0/
License information was derived automatically
Description
This dataset provides a free trial sample of best-selling products and their customer reviews from a leading e-commerce platform, designed to support product intelligence, sentiment analysis, and market trend evaluation. This sample is provided for evaluation purposes only. It includes a curated subset of the full dataset. To access the complete dataset, request additional attributes, or explore alternative product segments, please contact the data provider directly.

Key Features

2… See the full description on the dataset page: https://huggingface.co/datasets/datahiveai/Amazon-Reviews-Dataset.
Literature survey of sentiment analysis.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Feb 14, 2024
Cite
Semary, Noura A.; Hammad, Mohamed; Ahmed, Wesam; Amin, Khalid; Pławiak, Paweł (2024). Literature survey of sentiment analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001447137
Dataset updated
Feb 14, 2024
Authors
Semary, Noura A.; Hammad, Mohamed; Ahmed, Wesam; Amin, Khalid; Pławiak, Paweł
Description
A crucial part of sentiment classification is feature extraction, because it involves extracting valuable information from text data, which affects the model’s performance. The goal of this paper is to help in selecting a suitable feature extraction method to enhance the performance of sentiment analysis tasks. In order to provide directions for future machine learning and feature extraction research, it is important to analyze and summarize feature extraction techniques methodically from a machine learning standpoint. There are several methods under consideration, including Bag-of-words (BOW), Word2Vector, N-gram, Term Frequency-Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global vector for word representation (GloVe). To prove the ability of each feature extractor, we applied it to the Twitter US airlines and Amazon musical instrument reviews datasets. Finally, we trained a random forest classifier using 70% of the data for training and 30% for testing, enabling us to evaluate and compare the performance using different metrics. Based on our results, we find that the TF-IDF technique demonstrates superior performance, with an accuracy of 99% on the Amazon reviews dataset and 96% on the Twitter US airlines dataset. This study underscores the significance of feature extraction in sentiment analysis, offering practical insights to improve model performance and guide future research.
Amazon_Reviews_Binary_for_Sentiment_Analysis
huggingface.co
Updated Jul 26, 2024
Cite
yassir acharki (2024). Amazon_Reviews_Binary_for_Sentiment_Analysis [Dataset]. https://huggingface.co/datasets/yassiracharki/Amazon_Reviews_Binary_for_Sentiment_Analysis
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 26, 2024
Authors
yassir acharki
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for Dataset Name

The Amazon reviews polarity dataset is constructed by taking review scores 1 and 2 as negative, and 4 and 5 as positive. Samples with score 3 are ignored. In the dataset, class 1 is negative and class 2 is positive. Each class has 1,800,000 training samples and 200,000 testing samples.

Dataset Details Dataset Description

The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3… See the full description on the dataset page: https://huggingface.co/datasets/yassiracharki/Amazon_Reviews_Binary_for_Sentiment_Analysis.
Amazon Reviews for Sentiment Analysis
kaggle.com
Updated Apr 18, 2022
Cite
Haobo Xu (2022). Amazon Reviews for Sentiment Analysis [Dataset]. https://www.kaggle.com/datasets/haoboxu/amazon-reviews-for-sentiment-analysis/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 18, 2022
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Haobo Xu
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
Amazon reviews of Software, Fashion, and Video Games, split by year.
Software: 459,436 reviews
Fashion: 883,636 reviews
Video Games: 2,565,349 reviews
Columns: Id, overall (sentiment), reviewTime (year), reviewText
sentiment: {-1 (negative), 0 (neutral), 1 (positive)}
amazon-food-reviews-dataset
huggingface.co
Updated Dec 12, 2023
Cite
misschestnut (2023). amazon-food-reviews-dataset [Dataset]. https://huggingface.co/datasets/jhan21/amazon-food-reviews-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 12, 2023
Authors
misschestnut
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Dataset Card for "Amazon Food Reviews"

Dataset Summary

This dataset consists of reviews of fine foods from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain text review. It also includes reviews from all other Amazon categories.

Supported Tasks and Leaderboards

This dataset can be used for numerous tasks like sentiment analysis, text… See the full description on the dataset page: https://huggingface.co/datasets/jhan21/amazon-food-reviews-dataset.
amazon_us_reviews
tensorflow.org
huggingface.co
Updated Dec 6, 2022
Cite
(2022). amazon_us_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/amazon_us_reviews
Dataset updated
Dec 6, 2022
Description
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

More than 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).

Each dataset contains the following columns:
marketplace - 2-letter country code of the marketplace where the review was written.
customer_id - Random identifier that can be used to aggregate reviews written by a single author.
review_id - The unique ID of the review.
product_id - The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
product_parent - Random identifier that can be used to aggregate reviews for the same product.
product_title - Title of the product.
product_category - Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
star_rating - The 1-5 star rating of the review.
helpful_votes - Number of helpful votes.
total_votes - Number of total votes the review received.
vine - Review was written as part of the Vine program.
verified_purchase - The review is on a verified purchase.
review_headline - The title of the review.
review_body - The review text.
review_date - The date the review was written.
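
Because the files are tab delimited with no quote or escape characters, a parser should treat quotation marks as ordinary text. A minimal sketch using Python's csv module with quoting disabled is shown below; the two-line sample is invented for illustration and only includes a few of the columns listed above.

```python
import csv
import io

# Made-up sample with a header row and one review; fields are tab separated.
sample = (
    "marketplace\treview_id\tstar_rating\treview_body\n"
    'US\tR1\t5\t"Wow" is all I can say\n'
)

# QUOTE_NONE tells the reader not to give quote characters any special meaning.
reader = csv.DictReader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
rows = list(reader)
```

With the default quoting behaviour, a field that begins with a quotation mark would be interpreted as a quoted field, which can corrupt the review text.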

To use this dataset:

    import tensorflow_datasets as tfds
    ds = tfds.load('amazon_us_reviews', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.
Amazon UK shoes products reviews dataset
crawlfeeds.com
csv, zip
Updated Jun 27, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Crawl Feeds (2025). Amazon UK shoes products reviews dataset [Dataset]. https://crawlfeeds.com/datasets/amazon-uk-shoes-products-reviews-dataset
Explore at:
Available download formats: csv, zip
Dataset updated
Jun 27, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Unlock detailed insights with our Amazon UK Shoes Products Reviews Dataset, an invaluable resource for businesses, researchers, and data analysts. This dataset features comprehensive information, including product names, review texts, star ratings, and customer feedback for a wide range of shoe products available on Amazon UK.

Key Features:

Extensive Coverage: Includes detailed reviews and ratings for various shoe products, helping you analyze customer preferences and trends.

Structured Data: Available in easily accessible formats like product review dataset CSV, making it perfect for integration into your analytical workflows.

Actionable Insights: Leverage this dataset for customer sentiment analysis, product optimization, and competitive benchmarking.

Why Choose the Amazon UK Shoes Products Reviews Dataset?

Whether you're delving into customer behavior, conducting market research, or improving product offerings, this dataset empowers you to make informed decisions. By working with a dataset enriched with real-world feedback, you can:

Understand customer preferences: Dive into detailed reviews to uncover patterns in consumer likes and dislikes.

Enhance product offerings: Identify gaps and opportunities in the market to better meet customer demands.

Boost competitive analysis: Compare customer feedback across different brands and products.

Additional Datasets Available

Explore related datasets like the Amazon product review dataset, offering insights across various categories and regions. For specific needs, our curated product reviews dataset is tailored to help you gain a granular understanding of niche markets.
Amazon Fine Food Reviews
live.european-language-grid.eu
csv
Updated Dec 30, 2013
Cite
(2013). Amazon Fine Food Reviews [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/4949
Explore at:
Available download formats: csv
Dataset updated
Dec 30, 2013
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The dataset consists of reviews of fine foods from Amazon.
Consumer_goods_reviews
huggingface.co
Updated Jan 22, 2025
Cite
kevin kibebe (2025). Consumer_goods_reviews [Dataset]. https://huggingface.co/datasets/kevykibbz/Consumer_goods_reviews
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 22, 2025
Authors
kevin kibebe
Description
Amazon Product Review Dataset (2023)

Dataset Overview

The Amazon Product Review Dataset (2023) contains product reviews from Amazon customers. The dataset includes product information, review details, and metadata about the customers who left the reviews. This dataset can be used for various natural language processing (NLP) tasks, including sentiment analysis, review prediction, recommendation systems, and more.

Dataset Name: Amazon Product Review Dataset (2023) Dataset… See the full description on the dataset page: https://huggingface.co/datasets/kevykibbz/Consumer_goods_reviews.
Amazon_Reviews_for_Sentiment_Analysis_fine_grained_5_classes
huggingface.co
Updated Sep 4, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
yassir acharki (2025). Amazon_Reviews_for_Sentiment_Analysis_fine_grained_5_classes [Dataset]. https://huggingface.co/datasets/yassiracharki/Amazon_Reviews_for_Sentiment_Analysis_fine_grained_5_classes
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Sep 4, 2025
Authors
yassir acharki
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for Dataset Name

The Amazon reviews full score dataset is constructed by randomly taking 600,000 training samples and 130,000 testing samples for each review score from 1 to 5. In total there are 3,000,000 training samples and 650,000 testing samples.

Dataset Details Dataset Description

The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 5)… See the full description on the dataset page: https://huggingface.co/datasets/yassiracharki/Amazon_Reviews_for_Sentiment_Analysis_fine_grained_5_classes.
Amazon reviews
kaggle.com
Updated Oct 16, 2023
Cite
Abdallah Wagih Ibrahim (2023). Amazon reviews [Dataset]. https://www.kaggle.com/datasets/abdallahwagih/amazon-reviews/discussion?sort=undefined
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 16, 2023
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Abdallah Wagih Ibrahim
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Overview: This dataset contains a subset of Amazon customer reviews from the "Cell Phones & Accessories" category. The dataset provides valuable insights into customer sentiment and opinions related to various cell phone and accessory products available on Amazon. Whether you're interested in natural language processing, sentiment analysis, product recommendations, or market research, this dataset can be a valuable resource.

Context: With the ever-increasing variety of cell phones and accessories available online, understanding customer feedback and preferences is crucial for businesses, researchers, and data enthusiasts. This dataset offers a glimpse into customer sentiments regarding different products, allowing for a wide range of analytical and research applications.

License: Please note that this dataset is for research and analysis purposes only and may be subject to copyright and terms of use from Amazon. Make sure to comply with Amazon's policies when using this data.

Dataset Source: The original dataset was scraped from Amazon's website.
Amazon reviews
kaggle.com
Updated May 15, 2021
Cite
Kritanjali Jain (2021). Amazon reviews [Dataset]. https://www.kaggle.com/datasets/kritanjalijain/amazon-reviews/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 15, 2021
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Kritanjali Jain
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Amazon Review Polarity Dataset

OVERVIEW

Contains 34,686,770 Amazon reviews from 6,643,669 users on 2,441,053 products, from the Stanford Network Analysis Project (SNAP). This subset contains 1,800,000 training samples and 200,000 testing samples in each polarity sentiment.

ORIGIN

The Amazon reviews dataset consists of reviews from amazon. The data span a period of 18 years, including ~35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. For more information, please refer to the following paper: J. McAuley and J. Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys, 2013.

DESCRIPTION

The Amazon reviews polarity dataset is constructed by taking review scores 1 and 2 as negative, and 4 and 5 as positive. Samples with a score of 3 are ignored. In the dataset, class 1 is negative and class 2 is positive. Each class has 1,800,000 training samples and 200,000 testing samples.
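
The labelling rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the code used to build the dataset; the function name polarity_from_score is hypothetical.

```python
from typing import Optional


def polarity_from_score(score: int) -> Optional[int]:
    """Map a 1-5 star score to a polarity class (hypothetical helper).

    Returns 1 (negative) for scores 1-2, 2 (positive) for scores 4-5,
    and None for score 3, which the description says is ignored.
    """
    if score in (1, 2):
        return 1
    if score in (4, 5):
        return 2
    if score == 3:
        return None
    raise ValueError(f"score must be between 1 and 5, got {score}")


# Keep only the reviews that receive a polarity label.
scores = [5, 3, 1, 4, 2]
labels = [p for p in map(polarity_from_score, scores) if p is not None]
```

Dropping the neutral score before labelling is what makes the task binary: every remaining sample belongs to exactly one of the two classes.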

If you need help extracting the train.csv and test.csv files, check out the starter code.

The files train.csv and test.csv contain the training and testing samples, respectively, as comma-separated values.

Each CSV has 3 columns: polarity, title, and text. They correspond to the class index (1 or 2), the review title, and the review text.

polarity - 1 for negative and 2 for positive

title - review heading

text - review body

The review title and text are enclosed in double quotes ("), and any internal double quote is escaped by two double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
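
As a minimal sketch of reading this format, Python's standard csv module already handles the quoting and the doubled internal quotes; only the literal backslash-n sequence has to be turned back into a line break by hand. The function name read_polarity_rows is hypothetical, and the sketch assumes the files have no header row, which the description does not state.

```python
import csv
import io


def read_polarity_rows(handle):
    """Yield (polarity, title, text) tuples from a polarity CSV (hypothetical helper).

    Assumes three columns per row and no header row.
    """
    reader = csv.reader(handle, quotechar='"', doublequote=True)
    for polarity, title, text in reader:
        # Restore real line breaks from the literal backslash-n sequence.
        yield int(polarity), title.replace("\\n", "\n"), text.replace("\\n", "\n")


# Two made-up rows in the documented format, used here instead of the real file.
sample = (
    '"2","Great ""value""","Works well.\\nWould buy again."\n'
    '"1","Broke","Stopped after a week."\n'
)
rows = list(read_polarity_rows(io.StringIO(sample)))

# For the real data (UTF-8 encoding is an assumption):
# with open("train.csv", newline="", encoding="utf-8") as f:
#     rows = list(read_polarity_rows(f))
```

Reading the rows lazily with a generator keeps memory use low, which matters for a training file with several million rows.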

CITATION

The Amazon reviews polarity dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu). It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).
Amazon Reviews
dataandsons.com
csv, zip
Updated Feb 24, 2021
Cite
Peter Parker (2021). Amazon Reviews [Dataset]. https://www.dataandsons.com/categories/machine-learning/amazon-reviews
Available download formats: csv, zip
Dataset updated
Feb 24, 2021
Dataset provided by
Authors
Peter Parker
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
About this Dataset

Labelled dataset of Amazon reviews to be used for sentiment analysis or emotion-cause detection (.csv format)

Category

Machine Learning

Keywords

Amazon, csv

Row Count

649979

Price

$200.00

amazon-reviews-sentiment-analysis

hugginglearners/amazon-reviews-sentiment-analysis

37 scholarly articles cite this dataset (View in Google Scholar)
