63 datasets found
  1. RevBangla: Bangla Product Sentiment Analysis Dataset

    • data.mendeley.com
    Updated Mar 6, 2024
    Cite
    Saieef Sarower Sunny (2024). RevBangla: Bangla Product Sentiment Analysis Dataset [Dataset]. http://doi.org/10.17632/bnbbcdsf4m.1
    Dataset updated
    Mar 6, 2024
    Authors
    Saieef Sarower Sunny
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Bangla Product Comments Dataset is a comprehensive collection of product reviews gathered from diverse ecommerce platforms in Bangladesh. This dataset offers a rich source of information reflecting customer opinions and sentiments towards various products available online. This dataset holds significant value for businesses, researchers, and data scientists interested in understanding consumer behavior, product perception, and sentiment analysis within the Bangladeshi ecommerce landscape. By leveraging this dataset, stakeholders can derive actionable insights to enhance product quality, marketing strategies, and overall customer satisfaction.

    Columns:

    1. Product_ID: A unique identifier for each product, facilitating organization and referencing.
    2. Date: The date when the comment was posted, providing temporal context for analysis.
    3. Customer Name: The name or identifier of the customer who submitted the comment, ensuring traceability and potential user segmentation.
    4. Rating: A numerical representation (typically on a scale of 1 to 5) reflecting the customer's overall satisfaction level with the product.
    5. Label Sentiment: A categorical label assigned to each comment indicating the sentiment expressed by the customer (e.g., positive, negative). This classification facilitates sentiment analysis tasks.
    6. Comment: The actual text of the customer's review or comment, conveying specific opinions, feedback, or experiences regarding the product.
  2. Bangla Dataset on Youtube Political Comments |NLP

    • kaggle.com
    Updated Nov 3, 2024
    Cite
    Durjoy Chandra Paul (2024). Bangla Dataset on Youtube Political Comments |NLP [Dataset]. https://www.kaggle.com/datasets/durjoychandrapaul/bangla-political-comments-dataset-for-nlp-tasks
    Dataset updated
    Nov 3, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Durjoy Chandra Paul
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    This dataset consists of YouTube comments predominantly collected from political news videos relevant to Bangladesh. The comments are written in Bengali, enriched with emojis that express a range of emotions and opinions. These comments provide unique insights into the public sentiment and reactions related to political events, figures, and policies within the country. This dataset can be highly useful for NLP tasks such as sentiment analysis, emotion detection, and opinion mining. It enables researchers to study public sentiment, emotional expression, and political opinions through text and emojis in Bengali.

    Key Features:

    Language: The comments are in Bengali, reflecting authentic language use with local expressions and cultural nuances.

    Emojis: The presence of emojis in the dataset helps capture non-verbal cues and emotional expressions that add depth to the textual sentiment.

    Context: The data is sourced from videos specifically focused on political news, making it valuable for research related to social, political, and media analysis in Bangladesh.

  3. Bengali News Comments Sentiment

    • kaggle.com
    Updated Nov 26, 2020
    + more versions
    Cite
    Mobassir (2020). Bengali News Comments Sentiment [Dataset]. https://www.kaggle.com/mobassir/bengali-news-comments-sentiment/code
    Dataset updated
    Nov 26, 2020
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Mobassir
    Description

    Context

    Data Set For Sentiment Analysis On Bengali News Comments

    Content

    This dataset supports sentiment analysis of Bangla news comments. Every entry was annotated by three different individuals to capture three perspectives, and the final tag was chosen by majority decision. The dataset contains 13,802 entries in total.
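
    The majority-decision step described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the label names, the sample votes, and the choice to return None on a three-way tie are all assumptions, since the description does not state how ties were resolved.

```python
from collections import Counter


def majority_label(votes):
    """Return the label most annotators chose, or None when the top labels tie."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: a tie is left unresolved
    return ranked[0][0]


# Hypothetical annotations: three votes per comment.
annotations = [
    ["positive", "positive", "negative"],
    ["negative", "neutral", "negative"],
    ["positive", "negative", "neutral"],
]

final_tags = [majority_label(votes) for votes in annotations]
print(final_tags)
```

    With three annotators, a tie can only occur when all three disagree, so most entries receive a tag from at least two matching votes.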

    Acknowledgements

    https://data.mendeley.com/datasets/n53xt69gnf/2

    Inspiration

    Aiming to improve Bengali and Romanized Bangla NLP work.

  4. bn_code_mix_sentiment_dataset

    • huggingface.co
    Updated Sep 10, 2025
    Cite
    Swarnadeep Das (2025). bn_code_mix_sentiment_dataset [Dataset]. https://huggingface.co/datasets/Swarnadeep-28/bn_code_mix_sentiment_dataset
    Dataset updated
    Sep 10, 2025
    Authors
    Swarnadeep Das
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Bengali-English Code-Mixed Sentiment Dataset

      Dataset Summary
    

    This dataset contains Bengali–English code-mixed social media text annotated for sentiment classification. The primary goal is to support research and applications in code-mixed NLP, especially sentiment analysis in low-resource Indic languages. The dataset combines and cleans multiple publicly available sources:

    BnSentMix: Bengali–English code-mixed sentiment dataset
    SentMix-3L: Multi-lingual code-mixed… See the full description on the dataset page: https://huggingface.co/datasets/Swarnadeep-28/bn_code_mix_sentiment_dataset.

  5. BANGLA-ABSA: Unique Aspect Based Sentiment Analysis datasets in Bangla...

    • data.mendeley.com
    Updated Jul 9, 2024
    + more versions
    Cite
    Mahmudul Hasan (2024). BANGLA-ABSA: Unique Aspect Based Sentiment Analysis datasets in Bangla Language [Dataset]. http://doi.org/10.17632/998m4jy3m9.3
    Dataset updated
    Jul 9, 2024
    Authors
    Mahmudul Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the Bangla language, sentiment analysis is becoming more and more significant. Aspect-based sentiment analysis (ABSA) predicts the sentiment polarity on an aspect level. The data were collected from numerous individuals with a minimum of two aspects. Every comment is a complex or compound sentence. The datasets are organized in a folder named "BANGLA_ABSA dataset" which has four Excel files, one for each of the datasets: Car_ABSA, Mobile_phone_ABSA, Movie_ABSA, and Restaurant_ABSA. Each Excel file contains three columns namely Id, Comment, and {Aspect category, Sentiment Polarity}. Car_ABSA, Mobile_phone_ABSA, Movie_ABSA, and Restaurant_ABSA datasets have 1149, 975, 800, and 801 rows of data respectively.

  6. Bangla Bengali sentiment lexicon dictionary with positive and negative words...

    • narcis.nl
    • data.mendeley.com
    Updated Mar 9, 2021
    Cite
    Sazzed, S (via Mendeley Data) (2021). Bangla Bengali sentiment lexicon dictionary with positive and negative words [Dataset]. http://doi.org/10.17632/zggnjpnmwp.2
    Dataset updated
    Mar 9, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Sazzed, S (via Mendeley Data)
    Description

    This dataset contains around 1300 positive and negative Bengali (Bangla) sentiment words. This lexicon was created from a Bengali review corpus.

    If you use this lexicon, please cite the following paper:

    @inproceedings{sazzed2020development, title={Development of Sentiment Lexicon in Bengali utilizing Corpus and Cross-lingual Resources}, author={Sazzed, Salim}, booktitle={2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI)}, pages={237--244}, year={2020}, organization={IEEE Computer Society} }

    https://www.cs.odu.edu/~ssazzed/IEEE_IRI_2020.pdf

  7. synthetic-bengali-sentiment

    • huggingface.co
    Updated Aug 1, 2025
    Cite
    shaikh R (2025). synthetic-bengali-sentiment [Dataset]. http://doi.org/10.57967/hf/5762
    Dataset updated
    Aug 1, 2025
    Authors
    shaikh R
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Bengali Sentiment Analysis Dataset

      Dataset Description
    

    This dataset contains 44,236 Bengali sentences with corresponding sentiment labels, synthetically generated using ChatGPT for natural language processing and machine learning research.

      Dataset Summary
    

    Language: Bengali (বাংলা)
    Total Entries: 44,236 synthetic sentences
    Task: Sentiment Classification
    Format: JSON
    Generation Method: OpenAI ChatGPT (GPT-4)
    License: CC0 1.0 Universal (Public Domain)…

    See the full description on the dataset page: https://huggingface.co/datasets/shaikh25/synthetic-bengali-sentiment.

  8. Bangla (Bengali) Drama Review Dataset

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    salim sazzed (2023). Bangla (Bengali) Drama Review Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.13162085.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    salim sazzed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The repository contains 3307 Negative reviews and 8500 Positive reviews collected and manually annotated from YouTube Bengali drama. If you use this dataset, please cite the following paper:

    @inproceedings{sazzed2020cross, title={Cross-lingual sentiment classification in low-resource Bengali language}, author={Sazzed, Salim}, booktitle={Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)}, pages={50--60}, year={2020} }

    If you have any questions, please email me: salimsazzad222@gmail.com.

  9. bengali_sentiment_analysis

    • huggingface.co
    Updated Aug 7, 2024
    + more versions
    Cite
    Akash Kundu (2024). bengali_sentiment_analysis [Dataset]. https://huggingface.co/datasets/Akash190104/bengali_sentiment_analysis
    Dataset updated
    Aug 7, 2024
    Authors
    Akash Kundu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Bengali Sentiment Analysis

      Context
    

    The dataset contains 3307 Negative reviews and 8500 Positive reviews collected and manually annotated from Youtube Bengali drama. Positive_Label=1 and Negative_Label=0

      Acknowledgements
    

    Sazzed, Salim (2021), “Bangla ( Bengali ) sentiment analysis classification benchmark dataset corpus”, Mendeley Data, V4, doi: 10.17632/p6zc7krs37.4

  10. Bengali Political Sentiment Analysis Dataset

    • data.mendeley.com
    Updated Oct 2, 2025
    Cite
    Adib Mahmud (2025). Bengali Political Sentiment Analysis Dataset [Dataset]. http://doi.org/10.17632/x5yc4m5yg2.2
    Dataset updated
    Oct 2, 2025
    Authors
    Adib Mahmud
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises 3,290 Bengali political comments sourced from social media platforms, news comment sections, and online political discussions, specifically curated for sentiment analysis research in Bengali NLP. The corpus provides a comprehensive resource for training and evaluating sentiment classification models within the political domain. The dataset features 3,290 instances distributed across five sentiment classes with excellent balance (variance <8%): Very Negative (675, 20.5%), Negative (663, 20.2%), Neutral (626, 19.0%), Very Positive (664, 20.2%), and Positive (662, 20.1%). Stored in Excel format with two columns containing Bengali political comments (Unicode text) and corresponding sentiment labels, the dataset maintains high quality with no missing values and verified annotations. Comment lengths average 83 characters, ranging from 11 to 398 characters. The collection encompasses diverse political discourse including government policies and governance, electoral processes and democracy, political parties and leadership dynamics, social and economic issues, current affairs and political events, along with public opinion and citizen responses to political developments. This dataset serves multiple research purposes, including Bengali sentiment analysis model development and benchmarking, political discourse analysis and opinion mining, natural language processing research for low-resource languages, cross-lingual sentiment analysis studies, social media analytics for Bengali content, multi-class text classification research, and comparative political sentiment studies across different linguistic and cultural contexts.
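
    The class shares quoted above can be rechecked from the stated counts with a few lines of arithmetic. The sketch below only reproduces the percentages. The final figure, the relative gap between the largest and smallest class, is one possible reading of the "variance <8%" remark and is an assumption, since the description does not define that term.

```python
# Class counts as stated in the dataset description.
counts = {
    "Very Negative": 675,
    "Negative": 663,
    "Neutral": 626,
    "Very Positive": 664,
    "Positive": 662,
}

total = sum(counts.values())

# Share of each class as a percentage, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Assumption: relative gap between the largest and smallest class, in percent.
largest, smallest = max(counts.values()), min(counts.values())
relative_gap = round(100 * (largest - smallest) / smallest, 1)

print(total)
print(shares)
print(relative_gap)
```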

  11. BAN-ABSA

    • kaggle.com
    Updated Oct 3, 2020
    Cite
    Mahfuz Ahmed Masum (2020). BAN-ABSA [Dataset]. https://www.kaggle.com/mahfuzahmed/banabsa/code
    Dataset updated
    Oct 3, 2020
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Mahfuz Ahmed Masum
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Mahfuz Ahmed Masum

    Released under CC0: Public Domain

  12. Bengali Identity Bias Evaluation Dataset (BIBED)

    • data.niaid.nih.gov
    Updated Aug 7, 2023
    Cite
    Das, Dipto (2023). Bengali Identity Bias Evaluation Dataset (BIBED) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7775520
    Dataset updated
    Aug 7, 2023
    Dataset provided by
    Das, Dipto
    Semaan, Bryan
    Guha, Shion
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Critical studies have found NLP systems to be biased based on gender and racial identities. However, few studies have focused on identities defined by cultural factors like religion and nationality. Compared to English, such research efforts are even further limited in major languages like Bengali due to the unavailability of labeled datasets. Our paper (see the reference) describes a process for developing a bias evaluation dataset highlighting cultural influences on identity. We also provide this Bengali dataset as an artifact outcome that can contribute to future critical research.

    If you find this dataset useful, please cite the associated paper:

    Das, D., Guha, S., & Semaan, B. (2023, May). Toward Cultural Bias Evaluation Datasets: The Case of Bengali Gender, Religious, and National Identity. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP) (pp. 68-83).

    BibTeX:

    @inproceedings{das-etal-2023-toward, title = "Toward Cultural Bias Evaluation Datasets: The Case of {B}engali Gender, Religious, and National Identity", author = "Das, Dipto and Guha, Shion and Semaan, Bryan", booktitle = "Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)", month = may, year = "2023", address = "Dubrovnik, Croatia", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2023.c3nlp-1.8", pages = "68--83", }

  13. Bangla Sentiment Analysis for Online Gaming.

    • kaggle.com
    Updated May 16, 2024
    Cite
    Saiful Islam (2024). Bangla Sentiment Analysis for Online Gaming. [Dataset]. https://www.kaggle.com/datasets/saifulislamtarek/bangla-sentiment-analysis-for-online-gaming
    Dataset updated
    May 16, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Saiful Islam
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Sentiment analysis is a recent Natural Language Processing (NLP) method for ascertaining a user's sentiment. Online gaming is one of the activities that people of all ages, especially young people, are being forced to engage in as a result of the recent COVID-19 pandemic. Since smartphones have made it easy for people to access the internet, the number of people playing online games has increased. This research study applied various machine learning classification algorithms to over 401 data points in an attempt to investigate online gaming addiction. All age groups are taken into account when gathering data, but students in high school, college, and university are given special consideration.

    Data Collection:

    This section identifies the different categories of Bengali-language text. Two parameters are considered: "Class" and "Opinions". The main focus of the proposed model is to gather user feedback on online game addiction, analyze the user data, and identify the types of data accordingly. As a result, the proposed dataset is restricted to 401 text documents in only two columns. One column holds the paragraph or text, meaning the user's 'opinions' about online gaming addiction, and the other holds the classification, labelled positive, negative, or neutral. More details are in the paper, DOI: http://dx.doi.org/10.1109/I-SMAC55078.2022.9987343

  14. Sentiment classification for restaurant dataset.

    • figshare.com
    xls
    Updated Sep 20, 2024
    + more versions
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Sentiment classification for restaurant dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t010
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in the Bengali language due to the shortage of perfectly labeled datasets and the complex variations in the Bengali language. This study used two open-source benchmark datasets of the Bengali language, Cricket, and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers, the Robustly Optimized BERT Approach, and our proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same dataset. Amongst them, our proposed transformative Random Forest and Bidirectional Encoder Representations from Transformers achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89 respectively.

  15. Data from: BanglaSarc3: A Benchmark Dataset for Bangla Sarcasm Detection...

    • data.mendeley.com
    Updated Feb 24, 2025
    Cite
    Susmoy Biswas (2025). BanglaSarc3: A Benchmark Dataset for Bangla Sarcasm Detection from Social Media to Advance Bangla NLP [Dataset]. http://doi.org/10.17632/7tn76wdhsr.1
    Dataset updated
    Feb 24, 2025
    Authors
    Susmoy Biswas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The BanglaSarc3 dataset serves as a benchmark resource for sarcasm classification in Bangla, ensuring balanced category representation. The primary objective of BanglaSarc3 is to mitigate humor misinterpretation that often leads to digital conflicts and misunderstandings in online communication. To enhance dataset quality, preprocessing steps such as anonymization, duplicate removal, and text normalization were applied. Additionally, three native Bangla speakers independently reviewed and validated the labels, ensuring annotation reliability.

    BanglaSarc3 is a ternary-class dataset containing 12,089 Facebook comments, categorized as follows:

    - Neutral: 4,056 comments
    - Sarcastic: 4,012 comments
    - Non-Sarcastic: 4,021 comments

    The BanglaSarc3 dataset has significant implications across multiple NLP and AI domains, including:

    1. Sarcasm Detection in Bangla Social Media
    2. Sentiment and Emotion Analysis
    3. Language Modeling and BNLP Advancements
    4. Explainable AI (XAI) in Bangla NLP
    5. Educational and Research Applications

    The BanglaSarc3 dataset is openly available for academic and research purposes, fostering collaboration and innovation within the Bangla NLP community. By providing a robust foundation for sarcasm classification, this dataset aims to drive advancements in Bangla-centric AI applications, ensuring more inclusive and context-aware language models.

  16. RBE-Sent

    • kaggle.com
    Updated Sep 26, 2025
    + more versions
    Cite
    Dalia Barua (2025). RBE-Sent [Dataset]. https://www.kaggle.com/datasets/baruadalia/rbe-sent
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Dalia Barua
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    RBE_Sent Dataset Description:

    The RBE_Sent (Roman Bengali-English Sentiment) dataset is a synthetic, gold-standard code-mixed dataset developed for sentiment analysis tasks involving Romanized Bengali and English. It captures real-world bilingual usage by blending Roman Bengali with English tokens within the same textual instances. The dataset is designed to support research in multilingual natural language processing, especially in the context of low-resource, code-mixed languages. Each entry in RBE_Sent is annotated for sentiment, enabling supervised learning and evaluation. By providing a structured and labeled resource in this underexplored linguistic domain, RBE_Sent contributes to advancing computational methods for understanding Bengali-English code-mixed communication.

    This dataset contains code-mixed product review text with Bengali as the matrix language. It has 18,086 Roman Bengali-English code-mixed product reviews.

    license: mit
    language: bn, en
    tags: codemixed product review ecommerce bengali dataset
    pretty_name:

  17. Data from: ANUBHUTI: A COMPREHENSIVE CORPUS FOR SENTIMENT ANALYSIS IN BANGLA...

    • data.mendeley.com
    Updated Jun 26, 2025
    Cite
    Swastika Kundu (2025). ANUBHUTI: A COMPREHENSIVE CORPUS FOR SENTIMENT ANALYSIS IN BANGLA REGIONAL LANGUAGES [Dataset]. http://doi.org/10.17632/mjxwby94yw.1
    Dataset updated
    Jun 26, 2025
    Authors
    Swastika Kundu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ANUBHUTI is a comprehensive dataset consisting of 2,000 sentences manually translated from standard Bangla into four major regional dialects: Mymensingh, Noakhali, Sylhet, and Chittagong. The dataset predominantly features political and religious content, reflecting the contemporary socio-political landscape of Bangladesh, alongside neutral texts to maintain balance. Each sentence is annotated using a dual annotation scheme: (i) multiclass thematic labeling categorizes sentences as Political, Religious, or Neutral, and (ii) multilabel emotion annotation assigns one or more emotions from Anger, Contempt, Disgust, Enjoyment, Fear, Sadness, and Surprise.

  18. Sentiment classification for cricket dataset.

    • plos.figshare.com
    xls
    Updated Sep 20, 2024
    + more versions
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Sentiment classification for cricket dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t008
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in the Bengali language due to the shortage of perfectly labeled datasets and the complex variations in the Bengali language. This study used two open-source benchmark datasets of the Bengali language, Cricket, and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers, the Robustly Optimized BERT Approach, and our proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same dataset. Amongst them, our proposed transformative Random Forest and Bidirectional Encoder Representations from Transformers achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89 respectively.

  19. Advancing Bengali NLP for Sentiment and Emotion Dataset

    • data.mendeley.com
    Updated Jan 27, 2025
    Cite
    Rownuk Ara Rumy (2025). Advancing Bengali NLP for Sentiment and Emotion Dataset [Dataset]. http://doi.org/10.17632/kztpv8g89p.1
    Dataset updated
    Jan 27, 2025
    Authors
    Rownuk Ara Rumy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of 34,812 Bengali posts and comments sourced from Facebook, Twitter, and Instagram, Bengali news portals, and literature. Techniques employed in data acquisition included scraping data from social media accounts through APIs and scraping only text data from websites. Microblogs consist of posts and comments from platforms like Facebook, Twitter, and Instagram, which allow for the capture of informal and emotionally rich text. Newspaper and magazine articles provide formal, sentiment-related information through opinions. Online literature, including Bengali novels, poems, and blogs, incorporates semantic relationships and linguistic nuances. Text data is collected from public sources through automated scripts. We used Selenium scripts written in the Python programming language. We used APIs to obtain structured social media data. Additionally, we complied with the requirements of privacy, data collection, and ethics. The dataset contains 5 emotion classes and 5 sentiment classes. For emotion, "Creepy" is the most frequent with 12,000 entries, followed by "Unbiased" with 8,500 entries, "Joyful" with 7,500 entries, "Bullying" with 4,000 entries, and "Surprise" with 2,500 entries. For sentiment, "Negative" is the most frequent with 8,000 entries, followed by "Neutral" with 7,000 entries, "Strongly Negative" with 6,800 entries, "Positive" with 5,500 entries, and "Strongly Positive" with 4,500 entries.

  20. Cricket_Sentiment

    • huggingface.co
    Updated Jul 15, 2025
    Cite
    Tanjim Mahmud (2025). Cricket_Sentiment [Dataset]. http://doi.org/10.57967/hf/5982
    Dataset updated
    Jul 15, 2025
    Authors
    Tanjim Mahmud
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Published Paper Information: If you use this dataset, please cite the following paper:

    @article{mahmud2024benchmark, title={A benchmark dataset for cricket sentiment analysis in bangla social media text}, author={Mahmud, Tanjim and Karim, Rezaul and Chakma, Rishita and Chowdhury, Tanjia and Hossain, Mohammad Shahadat and Andersson, Karl}, journal={Procedia Computer Science}, volume={238}, pages={377--384}, year={2024}, publisher={Elsevier} }
