62 datasets found
  1. Bangla Sentiment Dataset

    • data.mendeley.com
    Updated Jun 3, 2025
    Cite
    Jahanur Biswas (2025). Bangla Sentiment Dataset [Dataset]. http://doi.org/10.17632/rh67mckhbh.2
    Dataset updated
    Jun 3, 2025
    Authors
    Jahanur Biswas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Bangla Sentiment Dataset is a curated collection of sentiment-rich textual data in Bangla, focused on recent and trending topics. This dataset has been compiled from diverse sources, including Bangladeshi online newspapers, social media platforms, and blogs, ensuring a wide spectrum of language styles and sentiment expressions.

    Key Features:

    Focus on Recent Topics: The dataset emphasizes contemporary issues, trending discussions, and popular topics in Bangladeshi society. This includes sentiments on political developments, social movements, entertainment, cultural events, and other recent happenings.

    Source Variety:
    - Online Newspapers: Articles, editorials, headlines, and reader comments provide structured and semi-formal sentiment data.
    - Social Media: Posts, tweets, and comments reflect informal, conversational language with high emotional expressiveness.
    - Blogs: Opinion pieces and discussions offer detailed and context-rich sentiment content.

    Sentiment Labels: Each entry in the dataset is annotated with one of the following sentiment categories:
    - Positive (1): Texts expressing happiness, agreement, or optimism.
    - Negative (0): Texts reflecting criticism, disagreement, or pessimism.
    - Neutral (2): Texts presenting balanced or factual statements with minimal emotional bias.

    Linguistic and Stylistic Diversity: The dataset captures a range of Bangla language variations, including:
    - Formal and informal Bangla usage.
    - Regional dialects.
    - Transliterated Bangla (Banglish) commonly used on social media.

    Real-World Context: The inclusion of recent topics ensures that the dataset is relevant for analyzing public sentiment around current events and trends. This makes it particularly useful for real-time sentiment analysis applications.

    This dataset provides an invaluable resource for researchers and practitioners aiming to explore sentiment analysis in Bangla, with a special emphasis on modern-day relevance and real-world applicability.
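
    The description for this dataset states an integer label encoding (Positive = 1, Negative = 0, Neutral = 2). A minimal sketch of decoding such labels is shown below. Only the three codes come from the description; the function name, the lowercase label strings, and the sample list are assumptions made for illustration, and the actual file layout of the dataset is not specified here.

```python
# Label encoding as stated in the dataset description:
# Negative = 0, Positive = 1, Neutral = 2.
ID_TO_LABEL = {0: "negative", 1: "positive", 2: "neutral"}


def decode_labels(label_ids):
    """Map integer sentiment codes to readable names; reject unknown codes."""
    decoded = []
    for label_id in label_ids:
        if label_id not in ID_TO_LABEL:
            raise ValueError(f"unexpected sentiment code: {label_id!r}")
        decoded.append(ID_TO_LABEL[label_id])
    return decoded


# Hypothetical label column, for illustration only.
sample = [1, 0, 2, 1]
print(decode_labels(sample))
```

    Raising on unknown codes, rather than silently skipping them, is a design choice that surfaces mislabeled rows early.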

  2. roots_indic-bn_bangla_sentiment_classification_datasets

    • huggingface.co
    Updated Sep 23, 2022
    Cite
    BigScience Data (2022). roots_indic-bn_bangla_sentiment_classification_datasets [Dataset]. https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets
    Dataset updated
    Sep 23, 2022
    Dataset authored and provided by
    BigScience Data
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    ROOTS Subset: roots_indic-bn_bangla_sentiment_classification_datasets

    Bangla Sentiment Classification Datasets

    Dataset uid: bangla_sentiment_classification_datasets

    Description

    Multiple sentiment classification datasets for Bengali, which can also be used for training LMs. The datasets are the following:
    - ABSA_datasets -- This dataset was developed to perform aspect-based sentiment analysis in Bangla. License: CC BY 4.0
    - SAIL_data -- This dataset consists of tweet…

    See the full description on the dataset page: https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets.

  3. Data Set For Sentiment Analysis On Bengali News Comments

    • narcis.nl
    • data.mendeley.com
    Updated Sep 15, 2019
    Cite
    Chowdhury, M (via Mendeley Data) (2019). Data Set For Sentiment Analysis On Bengali News Comments [Dataset]. http://doi.org/10.17632/n53xt69gnf.2
    Dataset updated
    Sep 15, 2019
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Chowdhury, M (via Mendeley Data)
    Description

    This is a data set for sentiment analysis of Bangla news comments. Every item was annotated by three different individuals to capture three perspectives, and the final tag was chosen by majority decision. The data set contains 13,802 items in total.
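
    The majority-decision step described above can be sketched as follows. This is an illustration only: the label strings are hypothetical, and the description does not say how a three-way disagreement was resolved, so returning `None` in that case is an assumption.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by a strict majority of annotators, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Two of three annotators agree, so their label wins.
print(majority_label(["positive", "negative", "positive"]))
# All three disagree: no majority (handling here is an assumption).
print(majority_label(["positive", "negative", "neutral"]))
```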

  4. BANGLA-ABSA: Unique Aspect Based Sentiment Analysis datasets in Bangla...

    • data.mendeley.com
    Updated Jul 9, 2024
    Cite
    Mahmudul Hasan (2024). BANGLA-ABSA: Unique Aspect Based Sentiment Analysis datasets in Bangla Language [Dataset]. http://doi.org/10.17632/998m4jy3m9.3
    Dataset updated
    Jul 9, 2024
    Authors
    Mahmudul Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the Bangla language, sentiment analysis is becoming more and more significant. Aspect-based sentiment analysis (ABSA) predicts the sentiment polarity on an aspect level. The data were collected from numerous individuals with a minimum of two aspects. Every comment is a complex or compound sentence. The datasets are organized in a folder named "BANGLA_ABSA dataset" which has four Excel files, one for each of the datasets: Car_ABSA, Mobile_phone_ABSA, Movie_ABSA, and Restaurant_ABSA. Each Excel file contains three columns namely Id, Comment, and {Aspect category, Sentiment Polarity}. Car_ABSA, Mobile_phone_ABSA, Movie_ABSA, and Restaurant_ABSA datasets have 1149, 975, 800, and 801 rows of data respectively.
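
    As a quick worked check of the sizes quoted above, the sketch below sums the four reported row counts and prints each file's share of the total. The counts are taken from the description; the dictionary keys reuse the dataset names, and nothing is assumed about the contents of the Excel files themselves.

```python
# Row counts as reported in the dataset description.
rows = {
    "Car_ABSA": 1149,
    "Mobile_phone_ABSA": 975,
    "Movie_ABSA": 800,
    "Restaurant_ABSA": 801,
}

total = sum(rows.values())
for name, count in rows.items():
    # Share of each domain in the combined collection.
    print(f"{name}: {count} rows ({count / total:.1%})")
print(f"Total: {total} rows")
```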

  5. Bengali-E-commerce-sentiments

    • huggingface.co
    Updated Aug 31, 2024
    Cite
    Mahadi Hassan (2024). Bengali-E-commerce-sentiments [Dataset]. https://huggingface.co/datasets/Mahadih534/Bengali-E-commerce-sentiments
    Dataset updated
    Aug 31, 2024
    Authors
    Mahadi Hassan
    License

    Attribution 2.5 (CC BY 2.5): https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Description

    Mahadih534/Bengali-E-commerce-sentiments dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. Bangla Bengali sentiment lexicon dictionary with positive and negative words...

    • narcis.nl
    • data.mendeley.com
    Updated Mar 9, 2021
    Cite
    Sazzed, S (via Mendeley Data) (2021). Bangla Bengali sentiment lexicon dictionary with positive and negative words [Dataset]. http://doi.org/10.17632/zggnjpnmwp.2
    Dataset updated
    Mar 9, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Sazzed, S (via Mendeley Data)
    Description

    This dataset contains around 1300 positive and negative Bengali (Bangla) sentiment words. This lexicon was created from a Bengali review corpus.

    If you use this lexicon, please cite the following paper:

    @inproceedings{sazzed2020development,
      title={Development of Sentiment Lexicon in Bengali utilizing Corpus and Cross-lingual Resources},
      author={Sazzed, Salim},
      booktitle={2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI)},
      pages={237--244},
      year={2020},
      organization={IEEE Computer Society}
    }

    https://www.cs.odu.edu/~ssazzed/IEEE_IRI_2020.pdf

  7. BAN-ABSA

    • kaggle.com
    Updated Oct 3, 2020
    Cite
    Mahfuz Ahmed Masum (2020). BAN-ABSA [Dataset]. https://www.kaggle.com/datasets/mahfuzahmed/banabsa
    Dataset updated
    Oct 3, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mahfuz Ahmed Masum
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Mahfuz Ahmed Masum

    Released under CC0: Public Domain


  8. synthetic-bengali-sentiment

    • huggingface.co
    Updated Aug 1, 2025
    Cite
    shaikh R (2025). synthetic-bengali-sentiment [Dataset]. http://doi.org/10.57967/hf/5762
    Dataset updated
    Aug 1, 2025
    Authors
    shaikh R
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Bengali Sentiment Analysis Dataset

    Dataset Description

    This dataset contains 44,236 Bengali sentences with corresponding sentiment labels, synthetically generated using ChatGPT for natural language processing and machine learning research.

    Dataset Summary

    - Language: Bengali (বাংলা)
    - Total Entries: 44,236 synthetic sentences
    - Task: Sentiment Classification
    - Format: JSON
    - Generation Method: OpenAI ChatGPT (GPT-4)
    - License: CC0 1.0 Universal (Public Domain)…

    See the full description on the dataset page: https://huggingface.co/datasets/shaikh25/synthetic-bengali-sentiment.

  9. Bangla (Bengali) Drama Review Dataset

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    salim sazzed (2023). Bangla (Bengali) Drama Review Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.13162085.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    salim sazzed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The repository contains 3307 negative reviews and 8500 positive reviews collected and manually annotated from YouTube Bengali drama. If you use this dataset, please cite the following paper:

    @inproceedings{sazzed2020cross,
      title={Cross-lingual sentiment classification in low-resource Bengali language},
      author={Sazzed, Salim},
      booktitle={Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)},
      pages={50--60},
      year={2020}
    }

    If you have any questions, please email me: salimsazzad222@gmail.com.

  10. Bengali Identity Bias Evaluation Dataset (BIBED)

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Aug 7, 2023
    Cite
    Dipto Das; Shion Guha; Bryan Semaan (2023). Bengali Identity Bias Evaluation Dataset (BIBED) [Dataset]. http://doi.org/10.5281/zenodo.7775521
    Available download formats: bin
    Dataset updated
    Aug 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dipto Das; Shion Guha; Bryan Semaan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Critical studies have found NLP systems to be biased with respect to gender and racial identities. However, few studies have focused on identities defined by cultural factors like religion and nationality. Compared to English, such research efforts are even more limited in major languages like Bengali due to the unavailability of labeled datasets. Our paper (see the reference) describes a process for developing a bias evaluation dataset highlighting cultural influences on identity. We also provide this Bengali dataset as an artifact that can contribute to future critical research.

    If you find this dataset useful, please cite the associated paper:

    Das, D., Guha, S., & Semaan, B. (2023, May). Toward Cultural Bias Evaluation Datasets: The Case of Bengali Gender, Religious, and National Identity. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP) (pp. 68-83).

    BibTeX:

    @inproceedings{das-etal-2023-toward,
      title = "Toward Cultural Bias Evaluation Datasets: The Case of {B}engali Gender, Religious, and National Identity",
      author = "Das, Dipto and
       Guha, Shion and
       Semaan, Bryan",
      booktitle = "Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)",
      month = may,
      year = "2023",
      address = "Dubrovnik, Croatia",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/2023.c3nlp-1.8",
      pages = "68--83",
    }
  11. Data from: Twitter corpus of Resource-Scarce Languages for Sentiment...

    • figshare.com
    zip
    Updated Jun 12, 2018
    Cite
    Rajat Singh; Nurendra Choudhary (2018). Twitter corpus of Resource-Scarce Languages for Sentiment Analysis and Multilingual Emoji Prediction [Dataset]. http://doi.org/10.6084/m9.figshare.6477782.v6
    Available download formats: zip
    Dataset updated
    Jun 12, 2018
    Dataset provided by
    figshare
    Authors
    Rajat Singh; Nurendra Choudhary
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was created by leveraging social media platforms such as Twitter to develop corpora across multiple languages. The corpus creation methodology is applicable to resource-scarce languages, provided the speakers of that language are active users of social media platforms. We present an approach to extract social media microblogs such as tweets (Twitter). We created a corpus for multilingual sentiment analysis and emoji prediction in Hindi, Bengali and Telugu. Further, we perform and analyze multiple NLP tasks utilizing the corpus to get interesting observations.

  12. Data from: ANUBHUTI: A COMPREHENSIVE CORPUS FOR SENTIMENT ANALYSIS IN BANGLA...

    • data.mendeley.com
    Updated Jun 26, 2025
    Cite
    Swastika Kundu (2025). ANUBHUTI: A COMPREHENSIVE CORPUS FOR SENTIMENT ANALYSIS IN BANGLA REGIONAL LANGUAGES [Dataset]. http://doi.org/10.17632/mjxwby94yw.1
    Dataset updated
    Jun 26, 2025
    Authors
    Swastika Kundu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ANUBHUTI is a comprehensive dataset consisting of 2,000 sentences manually translated from standard Bangla into four major regional dialects: Mymensingh, Noakhali, Sylhet, and Chittagong. The dataset predominantly features political and religious content, reflecting the contemporary socio-political landscape of Bangladesh, alongside neutral texts to maintain balance. Each sentence is annotated using a dual annotation scheme: (i) multiclass thematic labeling categorizes sentences as Political, Religious, or Neutral, and (ii) multilabel emotion annotation assigns one or more emotions from Anger, Contempt, Disgust, Enjoyment, Fear, Sadness, and Surprise.

  13. Data from: BanglaSarc3: A Benchmark Dataset for Bangla Sarcasm Detection...

    • data.mendeley.com
    Updated Feb 24, 2025
    Cite
    Susmoy Biswas (2025). BanglaSarc3: A Benchmark Dataset for Bangla Sarcasm Detection from Social Media to Advance Bangla NLP [Dataset]. http://doi.org/10.17632/7tn76wdhsr.1
    Dataset updated
    Feb 24, 2025
    Authors
    Susmoy Biswas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BanglaSarc3 dataset serves as a benchmark resource for sarcasm classification in Bangla, ensuring balanced category representation. The primary objective of BanglaSarc3 is to mitigate humor misinterpretation that often leads to digital conflicts and misunderstandings in online communication. To enhance dataset quality, preprocessing steps such as anonymization, duplicate removal, and text normalization were applied. Additionally, three native Bangla speakers independently reviewed and validated the labels, ensuring annotation reliability.

    BanglaSarc3 is a ternary-class dataset containing 12,089 Facebook comments, categorized as follows:
    - Neutral: 4,056 comments
    - Sarcastic: 4,012 comments
    - Non-Sarcastic: 4,021 comments

    The BanglaSarc3 dataset has significant implications across multiple NLP and AI domains, including:
    1. Sarcasm Detection in Bangla Social Media
    2. Sentiment and Emotion Analysis
    3. Language Modeling and BNLP Advancements
    4. Explainable AI (XAI) in Bangla NLP
    5. Educational and Research Applications

    The BanglaSarc3 dataset is openly available for academic and research purposes, fostering collaboration and innovation within the Bangla NLP community. By providing a robust foundation for sarcasm classification, this dataset aims to drive advancements in Bangla-centric AI applications, ensuring more inclusive and context-aware language models.

  14. RBE_Sent

    • huggingface.co
    Updated Apr 20, 2025
    Cite
    Dalia Barua (2025). RBE_Sent [Dataset]. https://huggingface.co/datasets/DaliaBarua/RBE_Sent
    Dataset updated
    Apr 20, 2025
    Authors
    Dalia Barua
    Description

    RBE_Sent Dataset Description: The RBE_Sent (Roman Bengali-English Sentiment) dataset is a synthetic, gold-standard code-mixed dataset developed for sentiment analysis tasks involving Romanized Bengali and English. It captures real-world bilingual usage by blending Roman Bengali with English tokens within the same textual instances. The dataset is designed to support research in multilingual natural language processing, especially in the context of low-resource, code-mixed languages. Each… See the full description on the dataset page: https://huggingface.co/datasets/DaliaBarua/RBE_Sent.

  15. Overview of literature review.

    • plos.figshare.com
    xls
    Updated Sep 20, 2024
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Overview of literature review. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t001
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in the Bengali language due to the shortage of perfectly labeled datasets and the complex variations in the Bengali language. This study used two open-source benchmark datasets of the Bengali language, Cricket, and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers, the Robustly Optimized BERT Approach, and our proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same dataset. Amongst them, our proposed transformative Random Forest and Bidirectional Encoder Representations from Transformers achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89 respectively.

  16. Statistics of cricket dataset.

    • figshare.com
    xls
    Updated Sep 20, 2024
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Statistics of cricket dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t002
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in the Bengali language due to the shortage of perfectly labeled datasets and the complex variations in the Bengali language. This study used two open-source benchmark datasets of the Bengali language, Cricket, and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers, the Robustly Optimized BERT Approach, and our proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same dataset. Amongst them, our proposed transformative Random Forest and Bidirectional Encoder Representations from Transformers achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89 respectively.

  17. Motamot: A Dataset for Revealing the Supremacy of Large Language Models over...

    • data.mendeley.com
    Updated May 13, 2024
    Cite
    Fatema Tuj Johora Faria (2024). Motamot: A Dataset for Revealing the Supremacy of Large Language Models over Transformer Models in Bengali Political Sentiment Analysis [Dataset]. http://doi.org/10.17632/hdhnrrwdz2.1
    Dataset updated
    May 13, 2024
    Authors
    Fatema Tuj Johora Faria
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The "Motamot" dataset contains 7,058 data points labeled with Positive and Negative sentiments, tailored specifically for political sentiment analysis in the Bengali language. It comprises 4,132 instances labeled Positive and 2,926 instances labeled Negative.

    Specifics of the core data: Train 5647, Test 706, Validation 705

    Train:
    - Positive: 3306
    - Negative: 2341

    Test:
    - Positive: 413
    - Negative: 293

    Validation:
    - Positive: 413
    - Negative: 292
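
    The split figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts from the description; the dictionary layout and variable names are illustrative assumptions.

```python
# Per-split class counts as reported in the dataset description.
splits = {
    "train": {"positive": 3306, "negative": 2341},
    "test": {"positive": 413, "negative": 293},
    "validation": {"positive": 413, "negative": 292},
}

# Size of each split, and size of each class across all splits.
split_totals = {name: sum(counts.values()) for name, counts in splits.items()}
class_totals = {
    label: sum(counts[label] for counts in splits.values())
    for label in ("positive", "negative")
}
grand_total = sum(split_totals.values())

# These should match the figures quoted in the description.
assert split_totals == {"train": 5647, "test": 706, "validation": 705}
assert class_totals == {"positive": 4132, "negative": 2926}
assert grand_total == 7058

print(split_totals, class_totals, grand_total)
```

    The reported numbers are internally consistent: the three splits sum to 7,058, and the per-class sums match the 4,132 Positive and 2,926 Negative totals.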

  18. Bengali Podcast

    • defined.ai
    Updated Jun 13, 2025
    Cite
    Defined.ai (2025). Bengali Podcast [Dataset]. https://defined.ai/datasets/bengali-podcast
    Dataset updated
    Jun 13, 2025
    Dataset provided by
    Defined.ai
    Description

    777 hours of live, high-quality Bengali podcast data across domains for Conversational AI, Text-to-Speech, Emotion Detection and Sentiment Analysis.

  19. Cricket_Sentiment

    • huggingface.co
    Updated Jul 15, 2025
    Cite
    Tanjim Mahmud (2025). Cricket_Sentiment [Dataset]. http://doi.org/10.57967/hf/5982
    Dataset updated
    Jul 15, 2025
    Authors
    Tanjim Mahmud
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Published Paper Information:

    If you use this dataset, please cite the following paper:

    @article{mahmud2024benchmark,
      title={A benchmark dataset for cricket sentiment analysis in bangla social media text},
      author={Mahmud, Tanjim and Karim, Rezaul and Chakma, Rishita and Chowdhury, Tanjia and Hossain, Mohammad Shahadat and Andersson, Karl},
      journal={Procedia Computer Science},
      volume={238},
      pages={377--384},
      year={2024},
      publisher={Elsevier}
    }

  20. Evaluating this study in relation to prior research on aspect detection in...

    • plos.figshare.com
    xls
    Updated Sep 20, 2024
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Evaluating this study in relation to prior research on aspect detection in Bengali ABSA. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t014
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Evaluating this study in relation to prior research on aspect detection in Bengali ABSA.

Cite
Jahanur Biswas (2025). Bangla Sentiment Dataset [Dataset]. http://doi.org/10.17632/rh67mckhbh.2

Bangla Sentiment Dataset

241 scholarly articles cite this dataset (View in Google Scholar)
