Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset comprises customer reviews for Amazon, an online retail giant, featuring insights into customer experiences, including ratings, review titles, texts, and metadata. It is valuable for analyzing customer satisfaction, sentiment, and trends.
Column Descriptions:
- Reviewer Name: Identifies the reviewer.
- Profile Link: Links to the reviewer's profile for additional insights.
- Country: Indicates the reviewer's location.
- Review Count: Number of reviews by the same user, showing engagement level.
- Review Date: When the review was posted, useful for time analysis.
- Rating: Numerical satisfaction measure.
- Review Title: Summarizes the review sentiment.
- Review Text: Detailed customer feedback.
- Date of Experience: When the service/product was experienced.
Prospective applications:
- Sentiment Analysis: Analyze review texts and titles to assess overall customer sentiment toward products, enabling the identification of strengths and weaknesses.
- Customer Satisfaction Tracking: Track and visualize rating trends over time to understand fluctuations in customer satisfaction.
- Product Improvement: Identify common themes in reviews to highlight areas for product enhancement or development.
- Market Segmentation: Use country and demographic information to customize marketing strategies and gain insights into regional preferences.
- Competitor Analysis: Evaluate customer feedback on Amazon products in comparison to competitors to determine market positioning.
- Recommendation Systems: Leverage review data to enhance recommendation algorithms, improving personalized shopping experiences.
- Trend Analysis: Investigate temporal patterns in reviews to link sentiment changes with marketing efforts or product launches.
This extensive dataset serves as a valuable asset for various analyses focused on enhancing customer engagement and refining business strategies.
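As a rough illustration of the Customer Satisfaction Tracking idea, the sketch below averages ratings per calendar month. It assumes each row is a dictionary keyed by the column names listed above ("Review Date", "Rating") and that dates are ISO 8601 strings; the actual date format in the file is not stated here, so the parsing step may need adjusting. The sample rows are made up.

```python
from collections import defaultdict
from datetime import datetime


def monthly_average_rating(rows):
    """Return {"YYYY-MM": mean rating} for rows with 'Review Date' and 'Rating'."""
    totals = defaultdict(lambda: [0.0, 0])  # month -> [sum of ratings, count]
    for row in rows:
        # Assumption: dates look like "2024-03-15"; adapt if the file differs.
        month = datetime.fromisoformat(row["Review Date"]).strftime("%Y-%m")
        totals[month][0] += float(row["Rating"])
        totals[month][1] += 1
    return {month: s / n for month, (s, n) in sorted(totals.items())}


# Made-up rows for illustration only.
sample_rows = [
    {"Review Date": "2024-01-05", "Rating": "5"},
    {"Review Date": "2024-01-20", "Rating": "3"},
    {"Review Date": "2024-02-11", "Rating": "1"},
]
trend = monthly_average_rating(sample_rows)
print(trend)
```

Rows read with `csv.DictReader` from the real file could be passed to the same function once the date format is confirmed.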
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a collection of over 800,000 URLs curated to represent a diverse range of online domains. Approximately 52% of the domains are labeled legitimate, and about 47% are labeled phishing.
The dataset has two columns, "url" and "status". The "url" column holds the uniform resource locator (URL) of each domain. The "status" column is binary: a value of 0 marks a domain flagged as phishing, while a value of 1 marks a domain deemed legitimate. Equally important is the balance between the two categories. With an almost equal number of phishing and legitimate instances, the dataset reduces the risk of class imbalance in subsequent analyses and model development, giving researchers and practitioners a more representative basis for understanding and mitigating online threats.
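The label convention (0 = phishing, 1 = legitimate) is easy to invert by accident, so a quick balance check is a reasonable first step. The sketch below is a minimal example using only the "url" and "status" columns described above; the in-memory CSV and its URLs are made up, and the real file name is not given here.

```python
import csv
import io
from collections import Counter


def class_balance(csv_file):
    """Return the share of each status value (0 = phishing, 1 = legitimate)."""
    reader = csv.DictReader(csv_file)
    counts = Counter(int(row["status"]) for row in reader)
    total = sum(counts.values())  # assumes at least one data row
    return {label: counts[label] / total for label in (0, 1)}


# Made-up stand-in for the real CSV file.
sample_csv = io.StringIO(
    "url,status\n"
    "http://example.com/login-update,0\n"
    "https://example.org/,1\n"
    "https://example.net/docs,1\n"
    "http://example.com/verify-account,0\n"
)
shares = class_balance(sample_csv)
print(shares)
```

On the full dataset, the two shares should come out close to the roughly 47% and 52% quoted above.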
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created from scraped reviews of Amazon products for the purpose of text classification. There are three classes:
- Negative Reviews
- Neutral Reviews
- Positive Reviews
The data columns are:
- Sentiments
- Cleaned Review
- Cleaned Review Length
- Review Score
This dataset poses a multiclass classification problem that can be approached with classical ML algorithms as well as deep learning algorithms. There is also a class imbalance: the negative class has fewer reviews than the positive and neutral classes.
For ML algorithms, use the mapping: negative --> -1, neutral --> 0, positive --> 1
For deep learning algorithms, use the mapping: negative --> 0, neutral --> 1, positive --> 2
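The two mappings can be written down as plain dictionaries. The sketch below is one way to apply them; the helper names `encode` and `one_hot` are hypothetical, and the one-hot step is an optional extra that some deep learning setups use, not something the dataset prescribes. The 0-to-2 mapping is presumably suggested because many deep learning loss functions expect non-negative class indices.

```python
# Mappings as stated in the dataset description.
ML_LABELS = {"negative": -1, "neutral": 0, "positive": 1}
DL_LABELS = {"negative": 0, "neutral": 1, "positive": 2}


def encode(sentiments, mapping):
    """Map sentiment strings to integer labels (case and whitespace tolerant)."""
    return [mapping[s.strip().lower()] for s in sentiments]


def one_hot(index, num_classes=3):
    """Turn a class index 0..num_classes-1 into a one-hot list."""
    return [1 if i == index else 0 for i in range(num_classes)]


# Made-up values standing in for the "Sentiments" column.
sample = ["positive", "negative", "neutral"]
ml_encoded = encode(sample, ML_LABELS)
dl_encoded = encode(sample, DL_LABELS)
dl_one_hot = [one_hot(i) for i in dl_encoded]
print(ml_encoded, dl_encoded, dl_one_hot)
```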
Looking forward to your model discoveries on this dataset.
Please leave an upvote if you find this relevant.
The pandemic has influenced all spheres of human life. COVID-19 affected the education sector in particular. The traditional classroom environment plays a vital role in molding the life of an individual. Bonds nurtured in the early years of life act as great moral support in the later stages of the journey. As the pandemic forced us into online education, this data collection aims to analyze the impact of online education. To gauge the satisfaction level of learners, a review was conducted.
- Gender: Male, Female
- Home Location: Rural, Urban
- Level of Education: Post Graduate, School, Under Graduate
- Age: Years
- Number of Subjects: 1-20
- Device type used to attend classes: Desktop, Laptop, Mobile
- Economic status: Middle Class, Poor, Rich
- Family size: 1-10
- Internet facility in your locality: Number scale (Very Bad to Very Good)
- Are you involved in any sports?: Yes, No
- Do elderly people monitor you?: Yes, No
- Study time: Hours
- Sleep time: Hours
- Time spent on social media: Hours
- Interested in Gaming?: Yes, No
- Have separate room for studying?: Yes, No
- Engaged in group studies?: Yes, No
- Average marks scored before pandemic in traditional classroom: range
- Your interaction in online mode: Number scale (Very Bad to Very Good)
- Clearing doubts with faculties in online mode: Number scale (Very Bad to Very Good)
- Interested in?: Practical, Theory, Both
- Performance in online: Number scale (Very Bad to Very Good)
- Your level of satisfaction in Online Education: Average, Bad, Good
Radhakrishnan, Sujatha (2021), "Online Education System - Review", Mendeley Data, V1, doi: 10.17632/bzk9zbyvv7.1
https://creativecommons.org/publicdomain/zero/1.0/
This dataset consists of reviews of fine foods from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain text review. It also includes reviews from all other Amazon categories.
Data includes:
- Reviews from Oct 1999 - Oct 2012
- 568,454 reviews
- 256,059 users
- 74,258 products
- 260 users with > 50 reviews
See this SQLite query for a quick sample of the dataset.
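For readers working with a SQLite copy of the data, a figure such as "users with > 50 reviews" can be recomputed with a grouped count. The sketch below is only an illustration: the table name `Reviews` and the columns `UserId`, `ProductId`, `Score`, and `Text` are assumptions not confirmed by this description, and the rows are made up and loaded into an in-memory database.

```python
import sqlite3

# In-memory stand-in for the real database; schema names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Reviews (UserId TEXT, ProductId TEXT, Score INTEGER, Text TEXT)"
)
conn.executemany(
    "INSERT INTO Reviews VALUES (?, ?, ?, ?)",
    [
        ("u1", "p1", 5, "Tasty"),
        ("u1", "p2", 4, "Good value"),
        ("u1", "p3", 2, "Too sweet"),
        ("u2", "p1", 3, "Average"),
        ("u3", "p2", 5, "Would buy again"),
        ("u3", "p3", 1, "Stale on arrival"),
    ],
)


def users_with_more_than(connection, min_reviews):
    """Return (UserId, review count) for users with more than min_reviews reviews."""
    query = """
        SELECT UserId, COUNT(*) AS n
        FROM Reviews
        GROUP BY UserId
        HAVING COUNT(*) > ?
        ORDER BY n DESC, UserId
    """
    return connection.execute(query, (min_reviews,)).fetchall()


active = users_with_more_than(conn, 1)
print(active)
```

Against the real database, calling `users_with_more_than(conn, 50)` and taking the length of the result would be the analogous check for the user count listed above.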
If you publish articles based on this dataset, please cite the following paper:
Speakers Reviews Dataset
This dataset contains reviews of various speakers from different brands, giving insights into customer experiences, ratings, and feedback. Here's what you'll typically find in the dataset:
What's in it?
Review ID: A unique code for each review.
Title: The headline of the review, usually a short summary of the user's opinion.
Author: The name (or username) of the person who wrote the review.
Rating: The star rating given by the reviewer (usually out of 5).
Content: The full text of the review, where people share what they liked or didn't like about the speaker.
Timestamp: When the review was posted (date and time).
Verified Purchase: Whether the reviewer actually bought the speaker or not.
Helpful Count: How many people found this review useful (it's a thumbs-up count).
Product Attributes: Details about the speaker itself, like brand, model, features, etc.
Company: Name of the company.
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The dataset provides a collection of game reviews from the Steam platform, making it suitable for natural language processing (NLP) and machine learning projects. The columns included are:
Potential Use Cases: - Sentiment classification (positive vs. negative). - Textual exploration (e.g., identifying frequently used words in positive reviews). - Training NLP models.
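The textual-exploration use case can be sketched with a simple word count over positive reviews. The column list is not reproduced in this description, so the example below works on hypothetical `(text, sentiment)` pairs with made-up content, and the stopword set is a tiny illustrative one rather than a standard list.

```python
import re
from collections import Counter

# Small illustrative stopword set; a real analysis would use a fuller list.
STOPWORDS = {"the", "a", "and", "is", "it", "this", "i", "to", "of"}


def top_words(reviews, label, n=3):
    """Return the n most frequent non-stopword tokens in reviews with the given label."""
    counts = Counter()
    for text, sentiment in reviews:
        if sentiment != label:
            continue
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Made-up (text, sentiment) pairs; real field names may differ.
sample_reviews = [
    ("Great game, great story", "positive"),
    ("The story is great and the combat is fun", "positive"),
    ("Boring and buggy", "negative"),
]
common = top_words(sample_reviews, "positive", n=2)
print(common)
```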
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains reviews for one of the most popular mobile apps, TikTok. All publicly posted reviews were scraped from the Google Play Store.
The dataset used for this project consists of customer reviews with the following columns:
The dataset contains 17,340 entries with three columns. The data is loaded from a CSV file.
The dataset was generated from a shared Kaggle dataset of Amazon reviews and fully translated into Italian using a Python script with Google APIs. The dataset is very rich, containing over 17,000 reviews.
However, it has one issue: it is highly imbalanced. This imbalance influenced the decision to work with this dataset for experiments during the model training phase.
https://creativecommons.org/publicdomain/zero/1.0/
Afer files review-chekpoints--2025-06-27--12796-12797
https://creativecommons.org/publicdomain/zero/1.0/
Afer files review-chekpoints--2025-07-06--12814-12815
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by MOHAMED AMINE SABBAHI
Released under Apache 2.0
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Yelp hotel and restaurant reviews (spam and not spam) with sentiments (positive, negative, and neutral) and review features.
Please cite our published works below when using this dataset.
1. Naveed Hussain, Hamid Turab Mirza, Faiza Iqbal, Ibrar Hussain, and Mohammad Kaleem. "Detecting Spam Product Reviews in Roman Urdu Script." The Computer Journal (2020).
2. Naveed Hussain, Hamid Turab Mirza, Abid Ali, Faiza Iqbal, Ibrar Hussain, and Mohammad Kaleem. "Spammer group detection and diversification of customers' reviews." PeerJ Computer Science 7:e472 https://doi.org/10.7717/peerj-cs.472 (2021).
https://creativecommons.org/publicdomain/zero/1.0/
Afer files review-chekpoints--2025-02-03--12377-12378
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a collection of reviews of Google apps available on the Play Store. It contains more than 90,000 reviews in total across various Google apps.
This dataset contains:
1. The basic description of apps (e.g., App Title, App Description, Number of Installs, etc.)
2. ReviewID
3. Score and Review by the User and thumbsUp count on the reviews.
4. Review creation and reply by developer date and time.
5. The App's Review by the Users
Not many datasets are available on app reviews on Kaggle.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 20,647 Amazon reviews for 836 data-science-related books. Every review consists of review text and a score (number of stars from 1 to 5).
Thanks to all the people who write reviews.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset, raw_review_Toys_and_Games, contains 100,000 Amazon product reviews from the Toys & Games category, sampled from 2023. It includes ratings, review text, product identifiers, user details, timestamps, helpful votes, and purchase verification status.
https://creativecommons.org/publicdomain/zero/1.0/
The Google Play App Reviews dataset contains valuable feedback from users who have reviewed apps on the Google Play Store. This dataset includes both user ratings and detailed comments, making it ideal for sentiment analysis, user experience evaluation, and app performance research.
Column Name | Description |
---|---|
review_id | Unique identifier for each review. |
user_name | Name of the user who submitted the review. |
review_title | Title of the review (may be empty in some cases). |
review_description | The content or feedback given by the user about the app. |
rating | Rating given by the user, ranging from 1 (low) to 5 (high). |
thumbs_up | Number of thumbs up the review received. |
review_date | Date and time the review was submitted. |
developer_response | Response from the app developer (if provided). |
developer_response_date | Date when the developer responded to the review. |
appVersion | The version of the app when the review was submitted. |
language_code | The language in which the review was written (e.g., 'en' for English). |
country_code | The country of the user based on their review (e.g., 'us' for United States). |
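As a starting point for working with these columns, the sketch below computes the mean rating per language and the share of reviews that received a developer response. It assumes rows are dictionaries keyed by the column names in the table (as `csv.DictReader` would produce) and that a missing developer response is an empty string, which is an assumption about the file. The sample rows are made up.

```python
from collections import defaultdict


def mean_rating_by(rows, key):
    """Return {group value: mean rating} grouped by the given column name."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[key]] += int(row["rating"])
        counts[row[key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


def response_rate(rows):
    """Share of reviews with a non-empty developer_response (rows must be non-empty)."""
    answered = sum(1 for row in rows if row.get("developer_response"))
    return answered / len(rows)


# Made-up rows using the column names from the table above.
sample_rows = [
    {"review_id": "r1", "rating": "5", "language_code": "en", "developer_response": "Thanks!"},
    {"review_id": "r2", "rating": "2", "language_code": "en", "developer_response": ""},
    {"review_id": "r3", "rating": "4", "language_code": "de", "developer_response": ""},
    {"review_id": "r4", "rating": "3", "language_code": "de", "developer_response": "We are looking into it."},
]
by_language = mean_rating_by(sample_rows, "language_code")
rate = response_rate(sample_rows)
print(by_language, rate)
```

The same `mean_rating_by` helper can be pointed at `country_code` or `appVersion` to compare ratings across regions or releases.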
Ready to dive into the world of app feedback and sentiment analysis? Explore the dataset, build models to understand user sentiments, and enhance app experiences based on real feedback.
Happy coding!
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Vishnupriya
Released under Apache 2.0