31 datasets found
  1. BDFoodSent: A Large-Scale Sentiment-Labeled Restaurant Review Dataset from...

    • data.mendeley.com
    Updated Dec 2, 2024
    Cite
    Ehsanur Rahman Rhythm (2024). BDFoodSent: A Large-Scale Sentiment-Labeled Restaurant Review Dataset from Bangladesh [Dataset]. http://doi.org/10.17632/532fxhnwbb.2
    Dataset updated
    Dec 2, 2024
    Authors
    Ehsanur Rahman Rhythm
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    BDFoodReview is a large-scale dataset containing 334,119 restaurant reviews collected from "Foodpanda Bangladesh". The dataset includes customer reviews in mixed languages (Bangla, English, and Banglish), translated into English, along with their corresponding ratings and sentiment labels.

    Dataset Statistics: Total Reviews: 334,119; Features/Columns: 19

    Potential Applications: Sentiment Analysis, Restaurant Review Classification, Customer Satisfaction Analysis, Opinion Mining, Natural Language Processing Research, Food Service Industry Analysis

  2. Aspect Based Sentiment Analysis - Indonesian Restaurant Review

    • dataverse.telkomuniversity.ac.id
    tsv
    Updated Mar 4, 2022
    Cite
    Root (2022). Aspect Based Sentiment Analysis - Indonesian Restaurant Review [Dataset]. http://doi.org/10.34820/FK2/UVHF0F
    Available download formats: tsv (916531)
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    Root
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The dataset used in the paper "Aspect Term Extraction Using Deep Learning-Based Approach on Indonesian Restaurant Reviews" by Rachmansyah Adhi Widhianto and Ade Romadhony.

  3. Sentiment Analysis Data

    • kaggle.com
    Updated Aug 4, 2021
    Cite
    Shubham Singh (2021). Sentiment Analysis Data [Dataset]. https://www.kaggle.com/datasets/shub99/sentiment-analysis-data/code
    Dataset updated
    Aug 4, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shubham Singh
    License

    https://cdla.io/sharing-1-0/

    Description

    Context

    This is a dataset for practicing sentiment analysis, text analytics, classification, etc. The dataset will be updated on a regular basis by scraping reviews from websites.

    Content

    Test your NLP skills, make some amazing notebooks, and perform some text classification. The reviews can be positive or negative (liked or disliked). Positive review: 1. Negative review: 0.
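
    The 1/0 labelling described above can be decoded as in the following minimal sketch. The helper name `decode_label` and the sample rows are hypothetical and are not taken from the dataset.

```python
# Label convention stated in the description: 1 = positive, 0 = negative.
LABEL_NAMES = {1: "positive", 0: "negative"}


def decode_label(value):
    """Map a raw 0/1 label (int or numeric string) to a readable name."""
    return LABEL_NAMES[int(value)]


# Hypothetical rows for illustration only (not taken from the dataset).
rows = [("Loved the food", "1"), ("Cold and bland", "0")]
decoded = [(text, decode_label(label)) for text, label in rows]
print(decoded)
```

    Any label other than 0 or 1 raises a `KeyError`, which makes unexpected values visible early.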

  4. Kaggle Restaurant Reviews Dataset - Dataset - LDM

    • service.tib.eu
    Updated Nov 25, 2024
    Cite
    (2024). Kaggle Restaurant Reviews Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/kaggle-restaurant-reviews-dataset
    Dataset updated
    Nov 25, 2024
    Description

    The Kaggle sentiment analysis competition dataset contains unlabeled restaurant reviews used to supplement the labeled SemEval dataset for improved performance in sentiment analysis.

  5. Restaurant Reviews CZ ABSA corpus v2

    • live.european-language-grid.eu
    binary format
    Updated Dec 31, 2015
    Cite
    (2015). Restaurant Reviews CZ ABSA corpus v2 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/1142
    Available download formats: binary format
    Dataset updated
    Dec 31, 2015
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Restaurant Reviews CZ ABSA - 2.15k reviews with their related target and category

    The work done is described in the paper: https://doi.org/10.13053/CyS-20-3-2469

  6. Sentiment Analysis data for handymen reviews

    • kaggle.com
    Updated Mar 18, 2025
    Cite
    Leo Jack (2025). Sentiment Analysis data for handymen reviews [Dataset]. https://www.kaggle.com/datasets/leojack4/sentiment-analysis-data-for-handymen-reviews/data
    Dataset updated
    Mar 18, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Leo Jack
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains customer reviews related to handyman services. It has been adapted from a restaurant review dataset by transforming food-related terms into handyman service-related terms. The dataset can be used for sentiment analysis, natural language processing (NLP), and customer feedback analysis in the service industry.

    Dataset Features

    Review (String): Customer feedback about the handyman service, detailing their experience with repairs, maintenance, or installations.
    Liked (Categorical: Yes/No): Indicates whether the customer was satisfied with the service (Yes) or dissatisfied (No).

    Usage

    This dataset is ideal for:
    Sentiment Analysis: Train models to classify positive and negative reviews.
    Customer Experience Research: Identify trends in customer satisfaction and complaints.
    NLP Applications: Test and develop text classification, keyword extraction, and sentiment prediction models.

    Potential Applications

    Developing a handyman service recommendation system.
    Analyzing customer sentiments to improve service quality.
    Training machine learning models for automated review classification.

    Acknowledgments

    This dataset was transformed from an original restaurant review dataset to suit handyman services, making it relevant for research in service industry sentiment analysis.

  7. Structured Zomato Restaurant Metadata and Review Corpus for Sentiment and...

    • ieee-dataport.org
    Updated Apr 9, 2025
    Cite
    Harsh Mishra (2025). Structured Zomato Restaurant Metadata and Review Corpus for Sentiment and Recommendation Analysis [Dataset]. https://ieee-dataport.org/documents/structured-zomato-restaurant-metadata-and-review-corpus-sentiment-and-recommendation
    Dataset updated
    Apr 9, 2025
    Authors
    Harsh Mishra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    rating

  8. TripAdvisor reviews of hotels and restaurants by gender

    • figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Mike Thelwall (2023). TripAdvisor reviews of hotels and restaurants by gender [Dataset]. http://doi.org/10.6084/m9.figshare.6255284.v1
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Mike Thelwall
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets of Tripadvisor reviews by UK residents of UK hotels and restaurants, together with the user's rating of the hotel. Datasets are split by:
    Hotel star level (2, 3, 4 or all [mixed]) or Restaurant;
    Reviewer gender (M=male-authored reviews; F=female-authored reviews; MF=equal numbers of male and female authored reviews for each rating level);
    Number of texts (1k, 2k, 4k, 8k, 16k, or all available).
    Each dataset contains equal numbers of reviews at each rating level. The reviews were selected at random from TripAdvisor. This data is from this paper: Thelwall, M. (2018). Gender bias in machine learning for sentiment analysis. Online Information Review, 42(3), 343-354. doi: 10.1108/OIR-05-2017-0152

  9. Restaurant and laptop review dataset

    • scidb.cn
    Updated Sep 18, 2023
    Cite
    Chengwei Cao (2023). Restaurant and laptop review dataset [Dataset]. http://doi.org/10.57760/sciencedb.11267
    Dataset updated
    Sep 18, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Chengwei Cao
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The file contains four public datasets drawn from user comments on laptops and restaurants. Each dataset consists of reviews, aspect terms (each made up of one or more words), and the sentiment polarity of each aspect term. The sentiment polarity is Positive, Neutral, or Negative.

  10. Sentiment Analysis Classification

    • kaggle.com
    zip
    Updated Sep 14, 2019
    Cite
    Prasanna Venkatesh (2019). Sentiment Analysis Classification [Dataset]. https://www.kaggle.com/datasets/prasy46/sentiment-analysis-classification
    Available download formats: zip (3169977 bytes)
    Dataset updated
    Sep 14, 2019
    Authors
    Prasanna Venkatesh
    Description

    Data

    We provide you with a data set in CSV format. The data set contains food reviews for the restaurant.

    The target variable is labeled Sentiment.

    Task

    Create a Classification model to predict the target variable Sentiment.

    1. A report - A Power point presentation
    2. Any custom code you used
    3. Instructions for me to run your model on a separate data set

    What should be in the report?

    1. List of any assumptions that you made
    2. Description of your methodology and solution path
    3. List of algorithms and techniques you used
    4. List of tools and frameworks you used
    5. Results and evaluation of your models

    How to evaluate the model

    1. Use the Accuracy score
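
    The accuracy score named above is the fraction of predictions that match the true labels. A minimal sketch follows; the function name `accuracy` and the example label lists are illustrative assumptions, not part of the task description.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions equal to the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical Sentiment labels: three of four predictions are correct.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```
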
  11. SemEval-2016 ABSA Restaurant Reviews-French: Test Data-Phase A (Subtask 1)

    • live.european-language-grid.eu
    txt
    Updated Oct 3, 2022
    Cite
    (2022). SemEval-2016 ABSA Restaurant Reviews-French: Test Data-Phase A (Subtask 1) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/674
    Available download formats: txt
    Dataset updated
    Oct 3, 2022
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Area covered
    French
    Description

    The restaurant test data for Subtask 1 Phase A evaluation of the SemEval 2016 Task 5: Aspect Based Sentiment Analysis (ABSA) for French (120 reviews, 668 sentences).

  12. ar_res_reviews

    • huggingface.co
    Updated Sep 22, 2013
    Cite
    Hady Elsahar (2013). ar_res_reviews [Dataset]. https://huggingface.co/datasets/hadyelsahar/ar_res_reviews
    Dataset updated
    Sep 22, 2013
    Authors
    Hady Elsahar
    License

    https://choosealicense.com/licenses/unknown/

    Description

    Dataset Card for ArRestReviews

    Dataset Summary

    Dataset of 8364 restaurant reviews from qaym.com in Arabic for sentiment analysis

    Supported Tasks and Leaderboards

    [More Information Needed]

    Languages

    The dataset is based on Arabic.

    Dataset Structure

    Data Instances

    A typical data point comprises the following:

    "polarity": a string value of either 0 or 1 indicating the sentiment of the review

    "text": is the… See the full description on the dataset page: https://huggingface.co/datasets/hadyelsahar/ar_res_reviews.

  13. Consumer Ratings & Reviews Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Apr 13, 2025
    Cite
    Data Insights Market (2025). Consumer Ratings & Reviews Software Report [Dataset]. https://www.datainsightsmarket.com/reports/consumer-ratings-reviews-software-1369808
    Available download formats: ppt, doc, pdf
    Dataset updated
    Apr 13, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Consumer Ratings & Reviews Software market is experiencing robust growth, projected to reach $478 million in 2025 and exhibiting a Compound Annual Growth Rate (CAGR) of 12.7% from 2025 to 2033. This expansion is fueled by the increasing reliance of businesses across diverse sectors (including retail, logistics, media & entertainment, travel & hospitality, and healthcare) on online reviews to enhance brand reputation, drive customer loyalty, and inform business strategies. The shift towards cloud-based solutions simplifies deployment and accessibility, contributing significantly to market growth. Furthermore, the rising adoption of social media and the increasing consumer expectation for transparent and readily available reviews are key drivers. Competitive pressures are driving innovation, with companies constantly refining their offerings to provide comprehensive analytics, sentiment analysis, and automated response features. Segmentation by application and deployment type reflects the market's adaptability to diverse business needs. The North American market currently holds a significant share, driven by early adoption and established e-commerce infrastructure, but growth in regions like Asia-Pacific, fueled by rapid digitalization, presents lucrative opportunities.

    While the market enjoys considerable momentum, challenges remain. Data security and privacy concerns surrounding sensitive customer information are crucial considerations for businesses. Integration with existing CRM and marketing platforms can also pose complexities for some companies. However, the overall trend points towards continued expansion, with a focus on improving the accuracy and authenticity of reviews and the development of more sophisticated analytics capabilities to leverage review data for strategic decision-making. This includes incorporating AI and machine learning to better understand customer sentiment and identify areas for improvement. The emergence of specialized solutions catering to specific industry needs will further fragment the market, yet simultaneously enhance its reach and overall impact on business operations.
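
    The description gives a 2025 base value and a CAGR but no end-of-period figure. The sketch below applies the standard compound-growth formula, value × (1 + rate)^years, under the assumption that the 12.7% rate compounds uniformly over the eight years from 2025 to 2033. The helper name `project_market_size` is hypothetical, and the result is an illustration rather than a figure from the report.

```python
def project_market_size(base_value, cagr, years):
    """Compound a base value forward: base * (1 + cagr) ** years."""
    return base_value * (1.0 + cagr) ** years


# Figures quoted in the description: $478 million in 2025, 12.7% CAGR to 2033.
# Uniform compounding over the whole period is an assumption of this sketch.
projected_2033 = project_market_size(478.0, 0.127, 2033 - 2025)
print(f"Implied 2033 market size: ${projected_2033:.0f} million")
```
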

  14. turkish-sentiment-dataset

    • huggingface.co
    Cite
    han, turkish-sentiment-dataset [Dataset]. https://huggingface.co/datasets/hanerdem/turkish-sentiment-dataset
    Authors
    han
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Turkish Sentiment Dataset (Restaurant Reviews)

    This dataset contains 15 manually created Turkish sentences labeled for sentiment analysis. The sentences are related to restaurant experiences and are categorized as positive, negative, or neutral.

    Labels

    olumlu (positive), olumsuz (negative), nötr (neutral)

    Format

    The dataset uses $ as a delimiter to avoid issues with commas inside the text.

    Example

    Yemeklerin lezzeti ve sunumu tam bir şölendi, şefi… See the full description on the dataset page: https://huggingface.co/datasets/hanerdem/turkish-sentiment-dataset.
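
    Reading a `$`-delimited file like the one described above can be sketched with Python's standard `csv` module. The column order (sentence first, label second), the helper name `read_dollar_delimited`, and the two sample lines are assumptions made for illustration; they are not taken from the dataset.

```python
import csv
import io

# Label names and English glosses as listed in the dataset description.
LABEL_TRANSLATIONS = {
    "olumlu": "positive",
    "olumsuz": "negative",
    "nötr": "neutral",
}


def read_dollar_delimited(text):
    """Parse '$'-delimited lines into (sentence, label) pairs.

    Commas inside a sentence are kept because '$' is the only delimiter.
    """
    reader = csv.reader(io.StringIO(text), delimiter="$")
    return [(row[0], row[1]) for row in reader if len(row) == 2]


# Hypothetical sample lines; the assumed layout is sentence$label.
sample = "Servis çok yavaştı, bir daha gelmem$olumsuz\nFiyatlar ortalama$nötr\n"
pairs = read_dollar_delimited(sample)
for sentence, label in pairs:
    print(sentence, "->", LABEL_TRANSLATIONS[label])
```
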

  15. Data from: A TripAdvisor Dataset for Dyadic Context Analysis

    • portalinvestigacion.udc.gal
    Updated 2022
    Cite
    López-Riobóo Botana, Iñigo Luis; Alonso-Betanzos, Amparo; Bolón-Canedo, Verónica; Guijarro-Berdiñas, Bertha (2022). A TripAdvisor Dataset for Dyadic Context Analysis [Dataset]. https://portalinvestigacion.udc.gal/documentos/668fc448b9e7c03b01bd8a9b
    Dataset updated
    2022
    Authors
    López-Riobóo Botana, Iñigo Luis; Alonso-Betanzos, Amparo; Bolón-Canedo, Verónica; Guijarro-Berdiñas, Bertha
    Description

    There are many contexts where dyadic data are present. In social networks, users are linked to a variety of items, defining interactions. In the social platform of TripAdvisor, users are linked to restaurants by means of reviews posted by them. Using the information of these interactions, we can get valuable insights for forecasting, proposing tasks related to recommender systems, sentiment analysis, text-based personalisation or text summarisation, among others. Furthermore, in the context of TripAdvisor there is a scarcity of public datasets and lack of well-known benchmarks for model assessment. We present six new TripAdvisor datasets from the restaurants of six different cities: London, New York, New Delhi, Paris, Barcelona and Madrid. If you use this data, please cite the following paper under submission process (preprint - arXiv). We exclusively collected the reviews written in English from the restaurants of each city. The tabular data is comprised of a set of six different CSV files, containing numerical, categorical and text features:

    parse_count: numerical (integer), corresponding number of extracted review by the web scraper (auto-incremental)
    author_id: categorical (string), univocal, incremental and anonymous identifier of the user (UID_XXXXXXXXXX)
    restaurant_name: categorical (string), name of the restaurant matching the review
    rating_review: numerical (integer), review score in the range 1-5
    sample: categorical (string), indicating “positive” sample for scores 4-5 and “negative” for scores 1-3
    review_id: categorical (string), univocal and internal identifier of the review (review_XXXXXXXXX)
    title_review: text, review title
    review_preview: text, preview of the review, truncated in the website when the text is very long
    review_full: text, complete review
    date: timestamp, publication date of the review in the format (day, month, year)
    city: categorical (string), city of the restaurant which the review was written for
    url_restaurant: text, restaurant url
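
    The rule relating `rating_review` to `sample` in the field list above (scores 4-5 are "positive", scores 1-3 are "negative") can be sketched as a small function. The name `sample_label` is a hypothetical helper, not something provided with the dataset.

```python
def sample_label(rating_review):
    """Derive the `sample` value from a 1-5 `rating_review` score.

    Scores 4-5 map to "positive" and scores 1-3 map to "negative",
    as stated in the dataset description.
    """
    if rating_review not in (1, 2, 3, 4, 5):
        raise ValueError("rating_review must be an integer from 1 to 5")
    return "positive" if rating_review >= 4 else "negative"


# Show the label for every possible score.
for score in range(1, 6):
    print(score, sample_label(score))
```

    Note that a score of 3 counts as "negative" under this scheme, so there is no neutral class.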

  16. Accuracies for Yelp restaurant dataset with 100.000 reviews.

    • plos.figshare.com
    xls
    Updated Apr 4, 2024
    Cite
    Ali Erkan; Tunga Güngör (2024). Accuracies for Yelp restaurant dataset with 100.000 reviews. [Dataset]. http://doi.org/10.1371/journal.pone.0299264.t009
    Available download formats: xls
    Dataset updated
    Apr 4, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Ali Erkan; Tunga Güngör
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Accuracies for Yelp restaurant dataset with 100.000 reviews.

  17. Statistics of restaurant dataset.

    • plos.figshare.com
    xls
    Updated Sep 20, 2024
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Statistics of restaurant dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t003
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in the Bengali language due to the shortage of perfectly labeled datasets and the complex variations in the Bengali language. This study used two open-source benchmark datasets of the Bengali language, Cricket, and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers, the Robustly Optimized BERT Approach, and our proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same dataset. Amongst them, our proposed transformative Random Forest and Bidirectional Encoder Representations from Transformers achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89 respectively.

  18. CONTEXTUAL RECOMMENDATION SYSTEM FOR LOCAL BUSINESSES

    • osf.io
    Updated May 25, 2023
    Cite
    IAEME; Mayukh Maitra; Surabhi Sinha (2023). CONTEXTUAL RECOMMENDATION SYSTEM FOR LOCAL BUSINESSES [Dataset]. http://doi.org/10.17605/OSF.IO/GZAY2
    Dataset updated
    May 25, 2023
    Dataset provided by
    Center For Open Science
    Authors
    IAEME; Mayukh Maitra; Surabhi Sinha
    Description

    ABSTRACT: Going through several reviews could be laborious when this has to be done for multiple restaurants. One could instead read a graphical representation of what is great at the restaurant. Currently on Yelp, the food recommendations are only based on the total number of mentions of the food item in the reviews. Higher mentions, irrespective of the context, get an up-vote toward recommended items. Including context from reviews and tips could greatly improve the list of recommended items. In this project, we combine Named Entity Recognition and Sentiment Analysis of reviews. Based on the sentiment of the reviews we aim to suggest the best dishes of a restaurant or the best restaurant offering a dish. We have leveraged various feature engineering methods to produce state-of-the-art results. We established that appropriately chosen feature vectors can significantly improve the classification performance. Fine-tuned BERT and bi-directional LSTM models produce better results than the machine learning models and, if trained for more epochs, may eventually prove to be the best classifier models.

    Keywords: Contextual Recommendation, Named Entity Recognition, BERT, LSTM, Count Vectors, TF-IDF

  19. ‘SemEval 2014 Task 4: AspectBasedSentimentAnalysis’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘SemEval 2014 Task 4: AspectBasedSentimentAnalysis’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-semeval-2014-task-4-aspectbasedsentimentanalysis-d634/322bfe9d/?iid=004-095&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘SemEval 2014 Task 4: AspectBasedSentimentAnalysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/charitarth/semeval-2014-task-4-aspectbasedsentimentanalysis on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Copied from https://alt.qcri.org/semeval2014/task4/#, all credits to respective authors.

    SemEval-2014 Task 4

    Task Description: Aspect Based Sentiment Analysis (ABSA)

    Sentiment analysis is increasingly viewed as a vital task both from an academic and a commercial standpoint. The majority of current approaches, however, attempt to detect the overall polarity of a sentence, paragraph, or text span, regardless of the entities mentioned (e.g., laptops, restaurants) and their aspects (e.g., battery, screen; food, service). By contrast, this task is concerned with aspect based sentiment analysis (ABSA), where the goal is to identify the aspects of given target entities and the sentiment expressed towards each aspect. Datasets consisting of customer reviews with human-authored annotations identifying the mentioned aspects of the target entities and the sentiment polarity of each aspect will be provided.

    In particular, the task consists of the following subtasks:

    Subtask 1: Aspect term extraction

    Given a set of sentences with pre-identified entities (e.g., restaurants), identify the aspect terms present in the sentence and return a list containing all the distinct aspect terms. An aspect term names a particular aspect of the target entity.

    For example, “I liked the service and the staff, but not the food”, “The food was nothing much, but I loved the staff”. Multi-word aspect terms (e.g., “hard disk”) should be treated as single terms (e.g., in “The hard disk is very noisy” the only aspect term is “hard disk”).

    Subtask 2: Aspect term polarity

    For a given set of aspect terms within a sentence, determine whether the polarity of each aspect term is positive, negative, neutral or conflict (i.e., both positive and negative).

    For example:

    “I loved their fajitas” → {fajitas: positive}
    “I hated their fajitas, but their salads were great” → {fajitas: negative, salads: positive}
    “The fajitas are their first plate” → {fajitas: neutral}
    “The fajitas were great to taste, but not to see” → {fajitas: conflict}

    Subtask 3: Aspect category detection

    Given a predefined set of aspect categories (e.g., price, food), identify the aspect categories discussed in a given sentence. Aspect categories are typically coarser than the aspect terms of Subtask 1, and they do not necessarily occur as terms in the given sentence.

    For example, given the set of aspect categories {food, service, price, ambience, anecdotes/miscellaneous}:

    “The restaurant was too expensive” → {price}
    “The restaurant was expensive, but the menu was great” → {price, food}

    Subtask 4: Aspect category polarity

    Given a set of pre-identified aspect categories (e.g., {food, price}), determine the polarity (positive, negative, neutral or conflict) of each aspect category.

    For example:

    “The restaurant was too expensive” → {price: negative}
    “The restaurant was expensive, but the menu was great” → {price: negative, food: positive}

    Datasets:

    Two domain-specific datasets for laptops and restaurants, consisting of over 6K sentences with fine-grained aspect-level human annotations have been provided for training.

    Restaurant reviews:

    This dataset consists of over 3K English sentences from the restaurant reviews of Ganu et al. (2009). The original dataset of Ganu et al. included annotations for coarse aspect categories (Subtask 3) and overall sentence polarities; we modified the dataset to include annotations for aspect terms occurring in the sentences (Subtask 1), aspect term polarities (Subtask 2), and aspect category-specific polarities (Subtask 4). We also corrected some errors (e.g., sentence splitting errors) of the original dataset. Experienced human annotators identified the aspect terms of the sentences and their polarities (Subtasks 1 and 2). Additional restaurant reviews, not in the original dataset of Ganu et al. (2009), are being annotated in the same manner, and they will be used as test data.

    Laptop reviews:

    This dataset consists of over 3K English sentences extracted from customer reviews of laptops. Experienced human annotators tagged the aspect terms of the sentences (Subtask 1) and their polarities (Subtask 2). This dataset will be used only for Subtasks 1 and 2. Part of this dataset will be reserved as test data.

    Dataset format:

    The sentences in the datasets are annotated using XML tags.

    The following example illustrates the format of the annotated sentences of the restaurants dataset.

    The possible values of the polarity field are: “positive”, “negative”, “conflict”, “neutral”. The possible values of the category field are: “food”, “service”, “price”, “ambience”, “anecdotes/miscellaneous”.

    The following example illustrates the format of the annotated sentences of the laptops dataset. The format is the same as in the restaurant datasets, with the only exception that there are no annotations for aspect categories. Notice that we annotate only aspect terms naming particular aspects (e.g., “everything about it” does not name a particular aspect).

    In the sentences of both datasets, there is an

    --- Original source retains full ownership of the source dataset ---
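
    The XML examples referred to in the description above did not survive in this listing. As a rough sketch of how such annotations could be read, the code below parses a hypothetical sentence with Python's standard `xml.etree.ElementTree`. The element and attribute names (`sentence`, `text`, `aspectTerm`, `aspectCategory`, `term`, `category`, `polarity`) are assumptions about the layout, and the sample sentence and its aspect term label are built for illustration from the Subtask 4 example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated sentence. Tag and attribute names are assumptions,
# since the XML examples are missing from the listing above.
SAMPLE = """
<sentences>
  <sentence id="1">
    <text>The restaurant was expensive, but the menu was great</text>
    <aspectTerms>
      <aspectTerm term="menu" polarity="positive"/>
    </aspectTerms>
    <aspectCategories>
      <aspectCategory category="price" polarity="negative"/>
      <aspectCategory category="food" polarity="positive"/>
    </aspectCategories>
  </sentence>
</sentences>
"""


def parse_sentences(xml_text):
    """Return one dict per sentence with its text, aspect terms, and categories."""
    root = ET.fromstring(xml_text.strip())
    parsed = []
    for sentence in root.iter("sentence"):
        terms = {
            t.get("term"): t.get("polarity")
            for t in sentence.iter("aspectTerm")
        }
        categories = {
            c.get("category"): c.get("polarity")
            for c in sentence.iter("aspectCategory")
        }
        parsed.append(
            {
                "text": sentence.findtext("text"),
                "terms": terms,
                "categories": categories,
            }
        )
    return parsed


result = parse_sentences(SAMPLE)
print(result[0]["categories"])
```

    A laptop-domain file would have no category annotations according to the description, in which case `categories` simply comes back empty.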

  20. KurdABSA: Aspect Based Sentiment Analysis Dataset for Kurdish Language

    • data.mendeley.com
    Updated Aug 6, 2025
    Cite
    Rania Azad (2025). KurdABSA: Aspect Based Sentiment Analysis Dataset for Kurdish Language [Dataset]. http://doi.org/10.17632/h5t7p4bcj2.1
    Dataset updated
    Aug 6, 2025
    Authors
    Rania Azad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset is the first publicly available aspect-based sentiment analysis dataset for the Sorani dialect of Kurdish, addressing a critical gap in natural language processing (NLP) research for low-resource languages. The dataset comprises more than 4000 ABSA quadruplets in the restaurant review domain, written in the Kurdish language (Sorani dialect) using the Perso-Arabic script. The dataset was automatically annotated using a few-shot, prompt-based model. This resource is intended for use in machine learning, deep learning, and cross-lingual model adaptation, making it suitable for training, fine-tuning, and benchmarking.
