Facebook
TwitterAttribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for amazon reviews for sentiment analysis
Dataset Summary
One of the most important problems in e-commerce is the correct calculation of the points given to after-sales products. The solution to this problem is to provide greater customer satisfaction for the e-commerce site, product prominence for sellers, and a seamless shopping experience for buyers. Another problem is the correct ordering of the comments given to the products. The prominence of misleading… See the full description on the dataset page: https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis.
Facebook
TwitterOpen Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This is a small subset of dataset of Book reviews from Amazon Kindle Store category.
5-core dataset of product reviews from Amazon Kindle Store category from May 1996 - July 2014. Contains total of 982619 entries. Each reviewer has at least 5 reviews and each product has at least 5 reviews in this dataset. Columns - asin - ID of the product, like B000FA64PK -helpful - helpfulness rating of the review - example: 2/3. -overall - rating of the product. -reviewText - text of the review (heading). -reviewTime - time of the review (raw). -reviewerID - ID of the reviewer, like A3SPTOKDG7WBLN -reviewerName - name of the reviewer. -summary - summary of the review (description). -unixReviewTime - unix timestamp.
There are two files one is preprocessed ready for sentiment analysis and other is unprocessed to you basically have to process the dataset and then perform sentiment analysis
This dataset is taken from Amazon product data, Julian McAuley, UCSD website. http://jmcauley.ucsd.edu/data/amazon/
License to the data files belong to them.
-Sentiment analysis on reviews. -Understanding how people rate usefulness of a review/ What factors influence helpfulness of a review. -Fake reviews/ outliers. -Best rated product IDs, or similarity between products based on reviews alone (not the best idea ikr). -Any other interesting analysis
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Explore the Amazon Product Reviews Dataset, a treasure trove of valuable insights into customer opinions and sentiments about a wide range of products available on Amazon's platform. This dataset is a goldmine for data enthusiasts, analysts, and machine learning practitioners interested in understanding consumer feedback, sentiment analysis, and product performance evaluation.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.
----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset is having the data of 1K+ Amazon Product's Ratings and Reviews as per their details listed on the official website of Amazon. The data was scraped in the month of January 2023 from the Official Website of Amazon.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset, containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5331 rows contains only negative samples and the last 5331 rows contain only positive samples, thus the data should be shuffled before usage.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed amazon product review data of Gen3EcoDot (Alexa) scrapped entirely from amazon.in
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
----------- Amazon Earphones Reviews ----------------
This dataset consists of a 9930 Amazon reviews, star ratings, for 10 latest (as of mid-2019) bluetooth earphone devices for learning how to train Machine for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review (raw) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
About Datasets: - Domain : Marketing - Project: Amazon Product Review Sentiment Analysis - Datasets: Reviews.csv - Dataset Type: Excel Data - Dataset Size: 56L+ records
KPI's: 1. Distribution of Amazon Product Ratings 2. How most people rated the products they bought from Amazon 3. Total of all sentiment scores
Process: 1. Understanding the problem 2. Data Collection 3. Data Cleaning 4. Exploring and analyzing the data 5. Interpreting the results
This data contains pandas, seaborn, matplotlib, nltk.sentiment.vader, SentimentIntensityAnalyzer, value_counts(), custom_colors, figsize, pie, sentiment_score
Facebook
TwitterThis dataset was created by Krishna Singh Rajput
Released under Other (specified in description)
Facebook
TwitterAttribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Amazon and IMDb Movie Reviews Combined Dataset
This dataset combines short review texts sourced from publicly available Amazon Product Reviews and IMDb Movie Reviews. Each review has been cleaned, standardized, and labelled with a binary sentiment value where 0 indicates a negative sentiment and 1 indicates a positive sentiment. The text was preprocessed to remove HTML tags, punctuation, and unnecessary whitespace, and was converted to lowercase to ensure consistency.
Facebook
TwitterA crucial part of sentiment classification is featuring extraction because it involves extracting valuable information from text data, which affects the model’s performance. The goal of this paper is to help in selecting a suitable feature extraction method to enhance the performance of sentiment analysis tasks. In order to provide directions for future machine learning and feature extraction research, it is important to analyze and summarize feature extraction techniques methodically from a machine learning standpoint. There are several methods under consideration, including Bag-of-words (BOW), Word2Vector, N-gram, Term Frequency- Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global vector for word representation (GloVe). To prove the ability of each feature extractor, we applied it to the Twitter US airlines and Amazon musical instrument reviews datasets. Finally, we trained a random forest classifier using 70% of the training data and 30% of the testing data, enabling us to evaluate and compare the performance using different metrics. Based on our results, we find that the TD-IDF technique demonstrates superior performance, with an accuracy of 99% in the Amazon reviews dataset and 96% in the Twitter US airlines dataset. This study underscores the paramount significance of feature extraction in sentiment analysis, endowing pragmatic insights to elevate model performance and steer future research pursuits.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A crucial part of sentiment classification is featuring extraction because it involves extracting valuable information from text data, which affects the model’s performance. The goal of this paper is to help in selecting a suitable feature extraction method to enhance the performance of sentiment analysis tasks. In order to provide directions for future machine learning and feature extraction research, it is important to analyze and summarize feature extraction techniques methodically from a machine learning standpoint. There are several methods under consideration, including Bag-of-words (BOW), Word2Vector, N-gram, Term Frequency- Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global vector for word representation (GloVe). To prove the ability of each feature extractor, we applied it to the Twitter US airlines and Amazon musical instrument reviews datasets. Finally, we trained a random forest classifier using 70% of the training data and 30% of the testing data, enabling us to evaluate and compare the performance using different metrics. Based on our results, we find that the TD-IDF technique demonstrates superior performance, with an accuracy of 99% in the Amazon reviews dataset and 96% in the Twitter US airlines dataset. This study underscores the paramount significance of feature extraction in sentiment analysis, endowing pragmatic insights to elevate model performance and steer future research pursuits.
Facebook
TwitterThis dataset was created by Tamilarasan Pravin
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Dataset Name
The Amazon reviews polarity dataset is constructed by taking review score 1 and 2 as negative, and 4 and 5 as positive. Samples of score 3 is ignored. In the dataset, class 1 is the negative and class 2 is the positive. Each class has 1,800,000 training samples and 200,000 testing samples.
Dataset Details
Dataset Description
The files train.csv and test.csv contain all the training samples as comma-sparated values. There are 3… See the full description on the dataset page: https://huggingface.co/datasets/yassiracharki/Amazon_Reviews_Binary_for_Sentiment_Analysis.
Facebook
Twitterhttps://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy
Unlock detailed insights with our Amazon UK Shoes Products Reviews Dataset, an invaluable resource for businesses, researchers, and data analysts. This dataset features comprehensive information, including product names, review texts, star ratings, and customer feedback for a wide range of shoe products available on Amazon UK.
Whether you're delving into customer behavior, conducting market research, or improving product offerings, this dataset empowers you to make informed decisions. By working with a dataset enriched with real-world feedback, you can:
Explore related datasets like the Amazon product review dataset, offering insights across various categories and regions. For specific needs, our curated product reviews dataset is tailored to help you gain a granular understanding of niche markets.
Facebook
Twitterhttps://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for "Amazon Beauty Reviews"
Dataset Summary
This dataset consists of reviews of "All Beauty" category from amazon. The data includes all ~700,000 reviews up to 2023. Reviews include product and user information, ratings, and a plain text review.
Supported Tasks and Leaderboards
This dataset can be used for numerous tasks like sentiment analysis, text classification, and user behavior analysis. It's particularly useful for training models to… See the full description on the dataset page: https://huggingface.co/datasets/jhan21/amazon-beauty-reviews-dataset.
Facebook
Twitterhttps://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for "Amazon Food Reviews"
Dataset Summary
This dataset consists of reviews of fine foods from amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain text review. It also includes reviews from all other Amazon categories.
Supported Tasks and Leaderboards
This dataset can be used for numerous tasks like sentiment analysis, text… See the full description on the dataset page: https://huggingface.co/datasets/jhan21/amazon-food-reviews-dataset.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset consists of reviews of fine foods from amazon.
Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Labelled dataset of Amazon reviews to be used for sentiment analysis or emotion-cause detection (.csv format)
Machine Learning
Amazon,csv
649979
$200.00
Facebook
Twitterhttps://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified
34,686,770 Amazon reviews from 6,643,669 users on 2,441,053 products, from the Stanford Network Analysis Project (SNAP). This subset contains 1,800,000 training samples and 200,000 testing samples in each polarity sentiment.
Facebook
TwitterAmazon Product Review Dataset (2023)
Dataset Overview
The Amazon Product Review Dataset (2023) contains product reviews from Amazon customers. The dataset includes product information, review details, and metadata about the customers who left the reviews. This dataset can be used for various natural language processing (NLP) tasks, including sentiment analysis, review prediction, recommendation systems, and more.
Dataset Name: Amazon Product Review Dataset (2023) Dataset… See the full description on the dataset page: https://huggingface.co/datasets/kevykibbz/Amazon_Customer_Review_2023.
Facebook
TwitterThis dataset was created by Sashminda Iranga
Released under Other (specified in description)
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
By [source]
This dataset contains an expansive collection of Amazon customer reviews ranging from 2013 to 2019 found across various categories of products, such as smartphones, laptops, books, and refrigerators. Each customer has their own unique ID, accompanied by a review header containing the title of their review as well as a detailed description and overall rating given by the customer according to their experience. Moreover, we have included our own sentiment analysis providing an additional layer to these reviews - breaking them down into ratings for positive or negative sentiment. With our invaluable insights into customers thoughts and feelings about different products across various categories over 6 years of reviews - this dataset is valuable resource for anyone interested in discovering trends on Amazon's customer base
For more datasets, click here.
- 🚨 Your notebook can be here! 🚨!
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: Amazon Review Data Web Scrapping - Amazon Review Data Web Scrapping.csv | Column name | Description | |:------------------|:----------------------------------------------------------------| | Category | The product category of the review. (String) | | Review_Header | The title of the customer review. (String) | | Review_text | The detailed text of the customer review. (String) | | Rating | The customer rating of the product. (Integer) | | Own_Rating | The sentiment analysis rating of the customer review. (Integer) |
If you use this dataset in your research, please credit the original authors. If you use this dataset in your research, please credit .
Facebook
TwitterAttribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for amazon reviews for sentiment analysis
Dataset Summary
One of the most important problems in e-commerce is the correct calculation of the points given to after-sales products. The solution to this problem is to provide greater customer satisfaction for the e-commerce site, product prominence for sellers, and a seamless shopping experience for buyers. Another problem is the correct ordering of the comments given to the products. The prominence of misleading… See the full description on the dataset page: https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis.