100+ datasets found
  1. Sentiment Dataset (Bangla Text)

    • kaggle.com
    zip
    Updated Jan 2, 2024
    Cite
    Tasrif Nur Himel (2024). Sentiment Dataset (Bangla Text) [Dataset]. https://www.kaggle.com/datasets/tasrifnurhimel/sentiment-dataset-bangla-text
    Available download formats: zip (639787 bytes)
    Dataset updated
    Jan 2, 2024
    Authors
    Tasrif Nur Himel
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    About this Dataset

    This dataset is designed for sentiment analysis tasks, specifically to classify text comments as positive or negative. It's a supervised dataset, meaning each comment is already labeled with its corresponding sentiment.

    Key Features:

    Two Columns:
    - Text: Contains the raw text of the comments.
    - Tag: Indicates the sentiment of the comment, labeled as either "positive" or "negative."

    Supervised Learning: Ideal for training and evaluating machine learning models for sentiment classification.

    Potential Applications:
    - Sentiment Analysis: Build models to automatically analyze emotions and opinions in various text data.
    - Social Media Analysis: Understand public sentiment towards brands, products, or topics on social media platforms.
    - Customer Feedback Analysis: Gauge customer satisfaction and identify areas for improvement based on reviews and feedback.
    - Text Classification: Develop text categorization systems for diverse applications.
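
    A minimal sketch of reading the two-column layout described above and mapping the tags to integer labels. The inline sample rows, the exact header spelling, and the 0/1 mapping are assumptions for illustration; the listing does not state the file name or header casing.

```python
import csv
import io

# Hypothetical rows in the described layout (Text, Tag).
# The real file name and header spelling are assumptions.
SAMPLE_CSV = """Text,Tag
খুব ভালো লেগেছে,positive
একদম বাজে অভিজ্ঞতা,negative
"""

# Assumed integer encoding; the listing only names the string tags.
TAG_TO_ID = {"negative": 0, "positive": 1}


def load_examples(handle):
    """Return (text, label_id) pairs, skipping rows with unknown tags."""
    examples = []
    for row in csv.DictReader(handle):
        tag = row["Tag"].strip().lower()
        if tag in TAG_TO_ID:
            examples.append((row["Text"].strip(), TAG_TO_ID[tag]))
    return examples


examples = load_examples(io.StringIO(SAMPLE_CSV))
print(examples)
```

    To read a downloaded file instead, pass an open file handle (UTF-8 encoded) in place of the in-memory sample.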

  2. Bangla Sentiment Dataset

    • data.mendeley.com
    Updated Jun 3, 2025
    Cite
    Jahanur Biswas (2025). Bangla Sentiment Dataset [Dataset]. http://doi.org/10.17632/rh67mckhbh.2
    Dataset updated
    Jun 3, 2025
    Authors
    Jahanur Biswas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Bangla Sentiment Dataset is a curated collection of sentiment-rich textual data in Bangla, focused on recent and trending topics. This dataset has been compiled from diverse sources, including Bangladeshi online newspapers, social media platforms, and blogs, ensuring a wide spectrum of language styles and sentiment expressions.

    Key Features:

    Focus on Recent Topics: The dataset emphasizes contemporary issues, trending discussions, and popular topics in Bangladeshi society. This includes sentiments on political developments, social movements, entertainment, cultural events, and other recent happenings.

    Source Variety:
    - Online Newspapers: Articles, editorials, headlines, and reader comments provide structured and semi-formal sentiment data.
    - Social Media: Posts, tweets, and comments reflect informal, conversational language with high emotional expressiveness.
    - Blogs: Opinion pieces and discussions offer detailed and context-rich sentiment content.

    Sentiment Labels: Each entry in the dataset is annotated with one of the following sentiment categories:
    - Positive (1): Texts expressing happiness, agreement, or optimism.
    - Negative (0): Texts reflecting criticism, disagreement, or pessimism.
    - Neutral (2): Texts presenting balanced or factual statements with minimal emotional bias.

    Linguistic and Stylistic Diversity: The dataset captures a range of Bangla language variations, including:
    - Formal and informal Bangla usage.
    - Regional dialects.
    - Transliterated Bangla (Banglish) commonly used on social media.

    Real-World Context: The inclusion of recent topics ensures that the dataset is relevant for analyzing public sentiment around current events and trends. This makes it particularly useful for real-time sentiment analysis applications.

    This dataset provides an invaluable resource for researchers and practitioners aiming to explore sentiment analysis in Bangla, with a special emphasis on modern-day relevance and real-world applicability.

  3. Large Sentiment Analysis Bangla Dataset

    • kaggle.com
    zip
    Updated Jul 27, 2024
    Cite
    Mir Tahmid (2024). Large Sentiment Analysis Bangla Dataset [Dataset]. https://www.kaggle.com/datasets/tahmidmir/largesentimentdata
    Available download formats: zip (275593 bytes)
    Dataset updated
    Jul 27, 2024
    Authors
    Mir Tahmid
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This is a large-scale Bangla dataset of positive, negative, and neutral comments. It has four features: platform (the source where the comments were collected), comment, sentiment, and label.

    The four columns are Platform, Comment, Sentiment, and Label. I collected the Bangla comments from Twitter, YouTube, and Google. Comments are positive, negative, or neutral. The Sentiment column distinguishes toxic, neutral, sad, funny, and happy comments, which are labeled 0, 1, 2, 3, and 4.

  4. Large Bangla Sentiment Dataset

    • kaggle.com
    zip
    Updated Jun 6, 2024
    Cite
    Parthaa_ghosh (2024). Large Bangla Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/parthaaghosh/large-bangla-sentiment-dataset
    Available download formats: zip (14689290 bytes)
    Dataset updated
    Jun 6, 2024
    Authors
    Parthaa_ghosh
    Description

    This is a dataset for Bengali sentiment analysis, merged from publicly available sentiment datasets. The sources I used to make the merged Bangla sentiment dataset are:
    1) https://www.kaggle.com/datasets/cryptexcode/sentnob-sentiment-analysis-in-noisy-bangla-texts
    2) https://github.com/atik-05/Bangla_ABSA_Datasets/tree/master
    3) https://data.mendeley.com/datasets/n53xt69gnf/3
    4) https://github.com/shakkhor/Academic-Thesis/blob/master/450/comments.csv
    5) https://github.com/mohsinulkabir14/BanglaBook/tree/main/data/csv

    After that, I applied some cleaning and preprocessing to the merged dataset. The dataset has two columns, "Data" and "Label", and three sentiment labels:
    1) Neutral: 0
    2) Positive: 1
    3) Negative: 2
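
    The numeric codes in this listing (Neutral 0, Positive 1, Negative 2) differ from those of the "Bangla Sentiment Dataset" listed earlier (Negative 0, Positive 1, Neutral 2), so combining the two needs a remapping step. A minimal sketch, assuming labels arrive as integers or numeric strings; the dictionary and function names are illustrative only.

```python
# Numeric codes as stated in two listings on this page.
BANGLA_SENTIMENT_SCHEME = {0: "negative", 1: "positive", 2: "neutral"}
LARGE_BANGLA_SCHEME = {0: "neutral", 1: "positive", 2: "negative"}


def to_names(labels, scheme):
    """Translate numeric codes into sentiment names under the given scheme."""
    return [scheme[int(label)] for label in labels]


a = to_names([0, 1, 2], BANGLA_SENTIMENT_SCHEME)
b = to_names(["0", "1", "2"], LARGE_BANGLA_SCHEME)
print(a)
print(b)
```

    Converting to names before merging avoids silently treating one dataset's "negative" rows as another's "neutral" rows.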

  5. synthetic-bengali-sentiment

    • huggingface.co
    Updated Aug 1, 2025
    + more versions
    Cite
    shaikh R (2025). synthetic-bengali-sentiment [Dataset]. http://doi.org/10.57967/hf/5762
    Dataset updated
    Aug 1, 2025
    Authors
    shaikh R
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Bengali Sentiment Analysis Dataset

      Dataset Description
    

    This dataset contains 44,236 Bengali sentences with corresponding sentiment labels, synthetically generated using ChatGPT for natural language processing and machine learning research.

      Dataset Summary
    

    Language: Bengali (বাংলা)
    Total Entries: 44,236 synthetic sentences
    Task: Sentiment Classification
    Format: JSON
    Generation Method: OpenAI ChatGPT (GPT-4)
    License: CC0 1.0 Universal (Public Domain)…

    See the full description on the dataset page: https://huggingface.co/datasets/shaikh25/synthetic-bengali-sentiment.

  6. A Multimodal Bangla Text–Audio Dataset for Sentiment Analysis

    • data.mendeley.com
    Updated Dec 15, 2025
    Cite
    Md. Darun Nayeem (2025). A Multimodal Bangla Text–Audio Dataset for Sentiment Analysis [Dataset]. http://doi.org/10.17632/5yb4jjzrx3.1
    Dataset updated
    Dec 15, 2025
    Authors
    Md. Darun Nayeem
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    • Bangla, a language spoken by more than 230 million people worldwide, is significantly underrepresented in speech and sentiment analysis research when compared to high-resource languages.
    • This is addressed with the dataset. Researchers and developers working on low-resource language technologies, such as sentiment analysis, speech recognition, and multimodal learning frameworks, should find this extensive resource very helpful.
    • Sentiment-aware speech recognition, speech-based emotion detection, emotionally expressive text-to-speech systems, multimodal sentiment classification, and speaker-independent recognition models are just a few of the many applications that can be developed and evaluated using this dataset.
    • Its modular structure promotes continuous research expansion by enabling contributors to add new regional vocabularies, dialectal variations, or additional sentiment classes over time.
    • The dataset is precisely balanced, with 4,000 audio recordings created by four native speakers (two male and two female) and 500 samples for each sentiment category. The sentences capture the natural and everyday use of the Bangla language, spanning a wide range of topics that include events, emotions, personal experiences, and general statements.

  7. roots_indic-bn_bangla_sentiment_classification_datasets

    • huggingface.co
    Updated Sep 23, 2022
    Cite
    BigScience Data (2022). roots_indic-bn_bangla_sentiment_classification_datasets [Dataset]. https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets
    Dataset updated
    Sep 23, 2022
    Dataset authored and provided by
    BigScience Data
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    ROOTS Subset: roots_indic-bn_bangla_sentiment_classification_datasets

      Bangla Sentiment Classification Datasets
    

    Dataset uid: bangla_sentiment_classification_datasets

      Description
    

    Multiple sentiment classification datasets for Bengali, which can also be used for training LMs. The datasets are the following:
    - ABSA_datasets -- This dataset was developed to perform the aspect-based sentiment analysis task in Bangla. License: CC BY 4.0
    - SAIL_data -- This dataset, consists of tweet…

    See the full description on the dataset page: https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets.

  8. bn_code_mix_sentiment_dataset

    • huggingface.co
    Updated Sep 10, 2025
    Cite
    Swarnadeep Das (2025). bn_code_mix_sentiment_dataset [Dataset]. https://huggingface.co/datasets/Swarnadeep-28/bn_code_mix_sentiment_dataset
    Dataset updated
    Sep 10, 2025
    Authors
    Swarnadeep Das
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Bengali-English Code-Mixed Sentiment Dataset

      Dataset Summary
    

    This dataset contains Bengali–English code-mixed social media text annotated for sentiment classification. The primary goal is to support research and applications in code-mixed NLP, especially sentiment analysis in low-resource Indic languages. The dataset combines and cleans multiple publicly available sources:

    BnSentMix: Bengali–English code-mixed sentiment dataset
    SentMix-3L: Multi-lingual code-mixed…

    See the full description on the dataset page: https://huggingface.co/datasets/Swarnadeep-28/bn_code_mix_sentiment_dataset.

  9. Bangla ( Bengali ) sentiment analysis classification benchmark dataset...

    • data.mendeley.com
    Updated Jan 8, 2021
    + more versions
    Cite
    Salim Sazzed (2021). Bangla ( Bengali ) sentiment analysis classification benchmark dataset corpus [Dataset]. http://doi.org/10.17632/p6zc7krs37.4
    Dataset updated
    Jan 8, 2021
    Authors
    Salim Sazzed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Bangla ( Bengali ) sentiment analysis dataset

    The repository contains 3307 negative reviews and 8500 positive reviews, collected from YouTube Bengali drama and manually annotated.

    If you use this dataset, please cite the following paper-

    @inproceedings{sazzed2020cross,
      title={Cross-lingual sentiment classification in low-resource Bengali language},
      author={Sazzed, Salim},
      booktitle={Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)},
      pages={50--60},
      year={2020}
    }

    If you have any questions, please email me- salimsazzad222@gmail.com.
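
    The two classes in this listing are imbalanced (3307 negative vs. 8500 positive reviews). A minimal sketch of inverse-frequency class weights, one common heuristic for such cases (total samples divided by number of classes times class count). This is an illustration under that assumption, not a method the dataset author prescribes.

```python
# Class counts as stated in the description above.
counts = {"negative": 3307, "positive": 8500}

total = sum(counts.values())
num_classes = len(counts)

# Inverse-frequency ("balanced") weights: rarer classes get larger weights.
weights = {name: total / (num_classes * n) for name, n in counts.items()}

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

    Weights like these are typically passed to a loss function or classifier so that errors on the smaller class count for more.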

  10. bn_code_mix_sentiment_dataset

    • huggingface.co
    Updated Sep 11, 2025
    Cite
    asif mahmud joy (2025). bn_code_mix_sentiment_dataset [Dataset]. https://huggingface.co/datasets/asfaqur/bn_code_mix_sentiment_dataset
    Dataset updated
    Sep 11, 2025
    Authors
    asif mahmud joy
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Bengali-English Code-Mixed Sentiment Dataset

      Dataset Summary
    

    This dataset contains Bengali–English code-mixed social media text annotated for sentiment classification. The primary goal is to support research and applications in code-mixed NLP, especially sentiment analysis in low-resource Indic languages. The dataset combines and cleans multiple publicly available sources:

    BnSentMix: Bengali–English code-mixed sentiment dataset
    SentMix-3L: Multi-lingual code-mixed…

    See the full description on the dataset page: https://huggingface.co/datasets/asfaqur/bn_code_mix_sentiment_dataset.

  11. RevBangla: Bangla Product Sentiment Analysis Dataset

    • data.mendeley.com
    Updated Mar 6, 2024
    + more versions
    Cite
    Saieef Sarower Sunny (2024). RevBangla: Bangla Product Sentiment Analysis Dataset [Dataset]. http://doi.org/10.17632/bnbbcdsf4m.1
    Dataset updated
    Mar 6, 2024
    Authors
    Saieef Sarower Sunny
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Bangla Product Comments Dataset is a comprehensive collection of product reviews gathered from diverse ecommerce platforms in Bangladesh. This dataset offers a rich source of information reflecting customer opinions and sentiments towards various products available online. This dataset holds significant value for businesses, researchers, and data scientists interested in understanding consumer behavior, product perception, and sentiment analysis within the Bangladeshi ecommerce landscape. By leveraging this dataset, stakeholders can derive actionable insights to enhance product quality, marketing strategies, and overall customer satisfaction.

    Columns:

    1. Product_ID: A unique identifier for each product, facilitating organization and referencing.
    2. Date: The date when the comment was posted, providing temporal context for analysis.
    3. Customer Name: The name or identifier of the customer who submitted the comment, ensuring traceability and potential user segmentation.
    4. Rating: A numerical representation (typically on a scale of 1 to 5) reflecting the customer's overall satisfaction level with the product.
    5. Label Sentiment: A categorical label assigned to each comment indicating the sentiment expressed by the customer (e.g., positive, negative). This classification facilitates sentiment analysis tasks.
    6. Comment: The actual text of the customer's review or comment, conveying specific opinions, feedback, or experiences regarding the product.

  12. BANGLA-ABSA: Unique Aspect Based Sentiment Analysis datasets in Bangla...

    • data.mendeley.com
    Updated Jul 9, 2024
    + more versions
    Cite
    Mahmudul Hasan (2024). BANGLA-ABSA: Unique Aspect Based Sentiment Analysis datasets in Bangla Language [Dataset]. http://doi.org/10.17632/998m4jy3m9.3
    Dataset updated
    Jul 9, 2024
    Authors
    Mahmudul Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the Bangla language, sentiment analysis is becoming more and more significant. Aspect-based sentiment analysis (ABSA) predicts the sentiment polarity on an aspect level. The data were collected from numerous individuals with a minimum of two aspects. Every comment is a complex or compound sentence. The datasets are organized in a folder named "BANGLA_ABSA dataset" which has four Excel files, one for each of the datasets: Car_ABSA, Mobile_phone_ABSA, Movie_ABSA, and Restaurant_ABSA. Each Excel file contains three columns namely Id, Comment, and {Aspect category, Sentiment Polarity}. Car_ABSA, Mobile_phone_ABSA, Movie_ABSA, and Restaurant_ABSA datasets have 1149, 975, 800, and 801 rows of data respectively.

  13. Bengali Political Sentiment Analysis Dataset

    • data.mendeley.com
    Updated Oct 2, 2025
    Cite
    Adib Mahmud (2025). Bengali Political Sentiment Analysis Dataset [Dataset]. http://doi.org/10.17632/x5yc4m5yg2.2
    Dataset updated
    Oct 2, 2025
    Authors
    Adib Mahmud
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises 3,290 Bengali political comments sourced from social media platforms, news comment sections, and online political discussions, specifically curated for sentiment analysis research in Bengali NLP. The corpus provides a comprehensive resource for training and evaluating sentiment classification models within the political domain. The dataset features 3,290 instances distributed across five sentiment classes with excellent balance (variance <8%): Very Negative (675, 20.5%), Negative (663, 20.2%), Neutral (626, 19.0%), Very Positive (664, 20.2%), and Positive (662, 20.1%). Stored in Excel format with two columns containing Bengali political comments (Unicode text) and corresponding sentiment labels, the dataset maintains high quality with no missing values and verified annotations. Comment lengths average 83 characters, ranging from 11 to 398 characters. The collection encompasses diverse political discourse including government policies and governance, electoral processes and democracy, political parties and leadership dynamics, social and economic issues, current affairs and political events, along with public opinion and citizen responses to political developments. This dataset serves multiple research purposes, including Bengali sentiment analysis model development and benchmarking, political discourse analysis and opinion mining, natural language processing research for low-resource languages, cross-lingual sentiment analysis studies, social media analytics for Bengali content, multi-class text classification research, and comparative political sentiment studies across different linguistic and cultural contexts.
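
    The class counts quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the total and the per-class shares. The "relative gap" figure is only one assumed reading of the description's "variance <8%" (largest class vs. smallest class); the description does not define the term.

```python
# Class counts as listed in the description above.
counts = {
    "Very Negative": 675,
    "Negative": 663,
    "Neutral": 626,
    "Positive": 662,
    "Very Positive": 664,
}

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Assumed interpretation of "variance <8%": gap between the largest and
# smallest class, relative to the smallest class.
largest, smallest = max(counts.values()), min(counts.values())
relative_gap = 100 * (largest - smallest) / smallest

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"relative gap: {relative_gap:.1f}%")
```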

  14. Data Set For Sentiment Analysis On Bengali News Comments

    • narcis.nl
    • data.mendeley.com
    Updated Sep 15, 2019
    + more versions
    Cite
    Chowdhury, M (via Mendeley Data) (2019). Data Set For Sentiment Analysis On Bengali News Comments [Dataset]. http://doi.org/10.17632/n53xt69gnf.2
    Dataset updated
    Sep 15, 2019
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Chowdhury, M (via Mendeley Data)
    Description

    This is a dataset for sentiment analysis on Bangla news comments. Each item was annotated by three different individuals to capture three perspectives, and the final tag was chosen by majority decision. The dataset contains 13802 items in total.
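
    The three-annotator majority decision described here can be sketched as follows. The label strings and the tie handling (returning None when no label has a strict majority) are assumptions; the listing does not say how the authors resolved three-way disagreements.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most annotators, or None on a tie."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no strict majority, e.g. three different labels
    return ranked[0][0]


# Hypothetical annotations for two comments, three annotators each.
agreed = majority_label(["positive", "positive", "negative"])
tied = majority_label(["positive", "negative", "neutral"])
print(agreed, tied)
```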

  15. Bangla ( Bengali ) sentiment analysis dataset

    • data.mendeley.com
    Updated Jun 24, 2020
    Cite
    Salim Sazzed (2020). Bangla ( Bengali ) sentiment analysis dataset [Dataset]. http://doi.org/10.17632/p6zc7krs37.1
    Dataset updated
    Jun 24, 2020
    Authors
    Salim Sazzed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Bangla ( Bengali ) sentiment analysis dataset

  16. Multilabeled Bengali Sentiment and Emotion Classification Dataset

    • data.mendeley.com
    Updated May 28, 2025
    Cite
    Rownuk Ara Rumy (2025). Multilabeled Bengali Sentiment and Emotion Classification Dataset [Dataset]. http://doi.org/10.17632/6bmbf33nnw.2
    Dataset updated
    May 28, 2025
    Authors
    Rownuk Ara Rumy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset, titled Multilabeled Sentiment and Emotion Classification Dataset, is developed to advance natural language processing (NLP) research in the Bengali language, particularly in the domains of sentiment analysis and emotion detection. It contains 40,811 entries of user-generated text from Bengali social media and comment sections, each annotated with two labels: one for sentiment and another for emotion.

    Sentiment Categories: The dataset is annotated with five sentiment classes:

    Very Negative – 8,979 entries (21.9%)

    Negative – 10,757 entries (26.3%)

    Neutral – 8,662 entries (21.2%)

    Positive – 7,186 entries (17.8%)

    Very Positive – 5,227 entries (12.8%)

    This distribution highlights a larger presence of negative and neutral sentiments, indicating a critical tone in the source data.

    Emotion Categories: There are seven emotion classes in the dataset:

    Happy – 8,154 entries (19.9%)

    Surprised – 3,470 entries (8.5%)

    Sexual – 7,250 entries (17.7%)

    Religious – 2,449 entries (6.0%)

    Calm – 7,583 entries (18.5%)

    Hateful – 4,919 entries (12.0%)

    Fearful – 7,086 entries (17.3%)

    This emotion distribution reveals that Happy, Calm, and Sexual emotions are among the most prevalent, while Religious and Surprised emotions are relatively less represented.

    Applications: This dataset is suitable for training and evaluating machine learning and deep learning models for tasks such as:

    Multilabel text classification

    Emotion recognition

    Sentiment analysis

    Hate speech and toxic comment detection in the Bengali language

    Language: Bengali (Bangla)

  17. Bangla Bengali sentiment lexicon dictionary with positive and negative words...

    • narcis.nl
    • data.mendeley.com
    Updated Mar 9, 2021
    Cite
    Sazzed, S (via Mendeley Data) (2021). Bangla Bengali sentiment lexicon dictionary with positive and negative words [Dataset]. http://doi.org/10.17632/zggnjpnmwp.2
    Dataset updated
    Mar 9, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Sazzed, S (via Mendeley Data)
    Description

    This dataset contains around 1300 positive and negative Bengali (Bangla) sentiment words. This lexicon was created from a Bengali review corpus.

    If you use this lexicon, please cite the following paper:

    @inproceedings{sazzed2020development,
      title={Development of Sentiment Lexicon in Bengali utilizing Corpus and Cross-lingual Resources},
      author={Sazzed, Salim},
      booktitle={2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI)},
      pages={237--244},
      year={2020},
      organization={IEEE Computer Society}
    }

    https://www.cs.odu.edu/~ssazzed/IEEE_IRI_2020.pdf

  18. Bengali Sentiment Dataset

    • kaggle.com
    zip
    Updated Jul 25, 2020
    Cite
    Nuhash Afnan (2020). Bengali Sentiment Dataset [Dataset]. https://www.kaggle.com/nuhashafnan/pseudolabel
    Available download formats: zip (631287 bytes)
    Dataset updated
    Jul 25, 2020
    Authors
    Nuhash Afnan
    Description

    Dataset

    This dataset was created by Nuhash Afnan


  19. MONOBHAV: A Large-Scale Bengali Dataset for Fine-Grained Sentiment Analysis

    • data.mendeley.com
    Updated Jan 27, 2026
    Cite
    Meherunnesa Hossain Ibnath (2026). MONOBHAV: A Large-Scale Bengali Dataset for Fine-Grained Sentiment Analysis [Dataset]. http://doi.org/10.17632/968kvv98m4.3
    Dataset updated
    Jan 27, 2026
    Authors
    Meherunnesa Hossain Ibnath
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MONOBHAV is a Bengali dataset for fine-grained sentiment analysis. It contains 10,000 Bengali texts collected from social media platforms and newspaper websites. Each text is manually annotated by native Bengali speakers into five sentiment classes - Strongly Negative, Negative, Neutral, Positive, and Strongly Positive. This dataset enhances the resources available for Bengali sentiment analysis and supports the development and evaluation of more accurate sentiment models for the language.

  20. BanglaMUSE: Bangla Text–Audio Sentiment Dataset

    • kaggle.com
    zip
    Updated Jan 8, 2026
    Cite
    Yasout141516 (2026). BanglaMUSE: Bangla Text–Audio Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasout141516/banglamuse-bangla-textaudio-sentiment-dataset/code
    Available download formats: zip (306676882 bytes)
    Dataset updated
    Jan 8, 2026
    Authors
    Yasout141516
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BanglaMUSE is a multimodal Bangla sentiment dataset containing aligned text–audio pairs designed for research in sentiment analysis, speech processing, and multimodal learning for low-resource languages.

    The dataset includes 1,000 Bangla sentences, evenly balanced between positive (500) and negative (500) sentiment classes. The sentences represent natural, everyday Bangla language usage and were manually curated and validated to ensure clear sentiment polarity.

    Each sentence is recorded by four native Bangla speakers (two female and two male), resulting in 4,000 speech recordings in total. All speakers recorded the same set of sentences, enabling controlled analysis of speaker variability while preserving identical textual content. Audio samples are provided in MP3 format, recorded in controlled indoor environments, and manually verified for quality and alignment.

    The dataset is distributed with a unified metadata.csv file that links sentence identifiers, sentiment labels, speaker information, and relative audio paths. BanglaMUSE supports tasks such as multimodal sentiment classification, sentiment-aware speech recognition, audio–text alignment, and speaker-independent modeling.
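
    A minimal sketch of a speaker-independent split over a metadata file like the one described. The column names (sentence_id, label, speaker_id, audio_path), the speaker identifiers, and the sample rows are hypothetical; the listing does not give the actual header of metadata.csv.

```python
import csv
import io

# Hypothetical rows; the real metadata.csv header is not given in the listing.
SAMPLE_METADATA = """sentence_id,label,speaker_id,audio_path
s0001,positive,F1,audio/F1/s0001.mp3
s0001,positive,M1,audio/M1/s0001.mp3
s0002,negative,F1,audio/F1/s0002.mp3
s0002,negative,M1,audio/M1/s0002.mp3
"""


def split_by_speaker(rows, held_out_speaker):
    """Put every recording of one speaker in the test set, the rest in train."""
    train = [r for r in rows if r["speaker_id"] != held_out_speaker]
    test = [r for r in rows if r["speaker_id"] == held_out_speaker]
    return train, test


rows = list(csv.DictReader(io.StringIO(SAMPLE_METADATA)))
train_rows, test_rows = split_by_speaker(rows, held_out_speaker="M1")
print(len(train_rows), len(test_rows))
```

    Because all speakers recorded the same sentences, this split keeps speakers disjoint but not the text; evaluating a text-only model would need an additional split by sentence identifier.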
