43 datasets found

Bengali-E-commerce-sentiments
huggingface.co
Updated Aug 31, 2024
Cite
Mahadi Hassan (2024). Bengali-E-commerce-sentiments [Dataset]. https://huggingface.co/datasets/Mahadih534/Bengali-E-commerce-sentiments
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 31, 2024
Authors
Mahadi Hassan
License
Attribution 2.5 (CC BY 2.5) https://creativecommons.org/licenses/by/2.5/
License information was derived automatically
Description
Mahadih534/Bengali-E-commerce-sentiments dataset hosted on Hugging Face and contributed by the HF Datasets community
Data Set For Sentiment Analysis On Bengali News Comments
narcis.nl
data.mendeley.com
Updated Sep 15, 2019
Cite
Chowdhury, M (via Mendeley Data) (2019). Data Set For Sentiment Analysis On Bengali News Comments [Dataset]. http://doi.org/10.17632/n53xt69gnf.2
Explore at:
Unique identifier
https://doi.org/10.17632/n53xt69gnf.2
Dataset updated
Sep 15, 2019
Dataset provided by
Data Archiving and Networked Services (DANS)
Authors
Chowdhury, M (via Mendeley Data)
Description
This data set supports sentiment analysis of Bangla news comments. Each item was annotated by three different individuals to capture three perspectives, and the final tag was chosen by majority decision. The data set contains 13,802 items in total.
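The majority-vote rule described above can be sketched as follows. This is a minimal illustration, not the authors' annotation tooling: the function name `majority_label` and the example label strings are assumptions, and since the description does not say how a three-way disagreement was resolved, the sketch simply returns `None` in that case.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # A strict majority needs more than half of the votes.
    return label if count > len(votes) / 2 else None


# Hypothetical votes from three annotators for two comments.
print(majority_label(["positive", "negative", "positive"]))  # positive
print(majority_label(["positive", "negative", "neutral"]))   # None
```

With three annotators and two labels a majority always exists; the `None` branch only matters if more than two labels are in play.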
Bangla (Bengali) Drama Review Dataset
figshare.com
txt
Updated May 30, 2023
Cite
salim sazzed (2023). Bangla (Bengali) Drama Review Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.13162085.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.13162085.v1
Dataset updated
May 30, 2023
Dataset provided by
figshare
Authors
salim sazzed
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The repository contains 3307 Negative reviews and 8500 Positive reviews collected and manually annotated from YouTube Bengali drama.

If you use this dataset, please cite the following paper:

@inproceedings{sazzed2020cross,
  title={Cross-lingual sentiment classification in low-resource Bengali language},
  author={Sazzed, Salim},
  booktitle={Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)},
  pages={50--60},
  year={2020}
}

If you have any questions, please email me: salimsazzad222@gmail.com.
roots_indic-bn_bangla_sentiment_classification_datasets
huggingface.co
Updated Sep 23, 2022
Cite
BigScience Data (2022). roots_indic-bn_bangla_sentiment_classification_datasets [Dataset]. https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets
Explore at:
Dataset updated
Sep 23, 2022
Dataset authored and provided by
BigScience Data
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
ROOTS Subset: roots_indic-bn_bangla_sentiment_classification_datasets

Bangla Sentiment Classification Datasets

Dataset uid: bangla_sentiment_classification_datasets

Description

Multiple sentiment classification datasets for Bengali, which can also be used for training LMs. The datasets are the following:
ABSA_datasets -- This dataset was developed to perform the aspect-based sentiment analysis task in Bangla. License: CC BY 4.0
SAIL_data -- This dataset consists of tweet… See the full description on the dataset page: https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets.
Bangla-Sentiment-Analysis
kaggle.com
Updated Jan 18, 2021
Cite
FARID (2021). Bangla-Sentiment-Analysis [Dataset]. https://www.kaggle.com/datasets/faridmiah/banglasentimentanalysis
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jan 18, 2021
Dataset provided by
Kaggle http://kaggle.com/
Authors
FARID
Description
Context

2-2325: From Twitter datasets, May-November 2013
2326-16127: From http://dx.doi.org/10.17632/n53xt69gnf.3

Bangla Financial lexicon Sentiment dictionary
kaggle.com
Updated Jul 30, 2023
Cite
Md. Ashraful Islam (2023). Bangla Financial lexicon Sentiment dictionary [Dataset]. https://www.kaggle.com/datasets/mdashrafulislam1998/bangla-financial-lexicon-data-dictionary
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 30, 2023
Dataset provided by
Kaggle http://kaggle.com/
Authors
Md. Ashraful Islam
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Welcome to the Bangla Financial Lexicon Data Dictionary project!

The financial lexicon data dictionary is a list of words used to calculate the sentiment of financial news articles. Bangla words were collected from an online Bangla dictionary API and manually categorized into 6 weighted groups. To accurately determine the sentiment of sentences, a lexicon data dictionary is required. This project's lexicon data dictionary only contains Bangla words and includes words with positive sentiment and words with negative sentiment.

This dataset was a crucial part of our research published in the journal paper titled "Stock Market Prediction of Bangladesh Using Multivariate Long Short-Term Memory with Sentiment Identification." The paper can be accessed and cited at http://doi.org/10.11591/ijece.v13i5.pp5696-5706.

Understanding the Categories:

Bull words: These words are called bull words because, from a financial standpoint, they carry positive connotations. They are typically associated with upward market trends, rising stock prices, and overall economic growth, and financial analysts and investors often use them to convey optimism about the state of the economy.

Bear words: The bear word list is the counterpart of the bull list in financial sentiment analysis. When evaluating the sentiment of business news, every entry on this list is treated as carrying negative sentiment. Bear word lists typically consist of words associated with downward trends in the stock market, such as recession, inflation, unemployment, and bankruptcy.

Negative words: The negative word list contains words like “না”, “নয়”, and “নেই”, which can make a full sentence negative in the Bangla language. These words can have a significant impact on the overall sentiment of a sentence, even if the other words in the sentence are positive, which makes the list a crucial tool for sentiment analysis in Bangla.

Coordinating conjunction words (Co con.): In the Bangla language, conjunctions like “কিন্তু”, “আদপে”, “এবং”, and “অথবা” play an important role in sentence construction, so they should have their own weighted effect value in sentiment analysis. Assigning weighted effect values to these conjunctions results in more accurate sentiment analysis.

Subordinating conjunctions (Sub con.): A second list of conjunctions contains words like "অধিকন্ত", "এমনকি", and "বিশেষত". These conjunctions are often used to indicate a shift in tone or emphasis in a sentence and can play a significant role in shaping the overall sentiment. Assigning weighted effect values to them further refines the sentiment analysis of financial news and information.

Adjectives and adverbs (Adj.): We listed some adjectives and adverbs, like "সবচাইতে", "অধিক", and "সর্বাধিক", because they intensify the sentiment of a sentence more than other simple words. We categorized them into 3 weighted categories: high, medium, and low. Words with high weight have the greatest impact, words with medium weight have a moderate impact, and words with low weight have the least impact.
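To make the role of the categories above concrete, here is a rough sketch of how a weighted lexicon of this kind might be applied to a tokenized sentence. Everything in it is an assumption for illustration: the bull and bear entries, all numeric weights, the function name `score`, and the scoring rule itself are hypothetical and are not taken from the published lexicon or paper. Conjunction weights are omitted to keep the sketch short.

```python
from typing import Dict, Iterable

# Hypothetical entries and weights for illustration only; the published
# lexicon defines its own word lists and weighted groups.
BULL: Dict[str, float] = {"লাভ": 1.0, "বৃদ্ধি": 1.0}
BEAR: Dict[str, float] = {"লোকসান": -1.0, "মন্দা": -1.0}
NEGATIONS = {"না", "নেই"}
INTENSIFIERS: Dict[str, float] = {"সর্বাধিক": 2.0, "অধিক": 1.5}


def score(tokens: Iterable[str]) -> float:
    """Sum bull/bear weights; an intensifier scales the next token,
    and any negation word flips the sign of the whole sentence."""
    total = 0.0
    boost = 1.0
    negate = False
    for tok in tokens:
        if tok in INTENSIFIERS:
            boost = INTENSIFIERS[tok]
        elif tok in NEGATIONS:
            negate = True
        else:
            total += boost * (BULL.get(tok, 0.0) + BEAR.get(tok, 0.0))
            boost = 1.0  # the intensifier only applies to one token
    return -total if negate else total


print(score("অধিক লাভ".split()))  # 1.5
```

Flipping the whole sentence on negation follows the description's remark that a negative word can make a full sentence negative; a real implementation would likely need a narrower negation scope.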
Bengali Identity Bias Evaluation Dataset (BIBED)
data.niaid.nih.gov
kaggle.com
Updated Aug 7, 2023
Cite
Das, Dipto (2023). Bengali Identity Bias Evaluation Dataset (BIBED) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7775520
Explore at:
Dataset updated
Aug 7, 2023
Dataset provided by
Semaan, Bryan
Guha, Shion
Das, Dipto
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Critical studies have found NLP systems to be biased with respect to gender and racial identities. However, few studies have focused on identities defined by cultural factors like religion and nationality. Compared to English, such research efforts are even more limited in major languages like Bengali due to the unavailability of labeled datasets. Our paper (see the reference) describes a process for developing a bias evaluation dataset highlighting cultural influences on identity. We also provide this Bengali dataset as an artifact outcome that can contribute to future critical research.

If you find this dataset useful, please cite the associated paper:

Das, D., Guha, S., & Semaan, B. (2023, May). Toward Cultural Bias Evaluation Datasets: The Case of Bengali Gender, Religious, and National Identity. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP) (pp. 68-83).

BibTeX:

@inproceedings{das-etal-2023-toward,
  title = "Toward Cultural Bias Evaluation Datasets: The Case of {B}engali Gender, Religious, and National Identity",
  author = "Das, Dipto and Guha, Shion and Semaan, Bryan",
  booktitle = "Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)",
  month = may,
  year = "2023",
  address = "Dubrovnik, Croatia",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2023.c3nlp-1.8",
  pages = "68--83",
}
synthetic-bengali-sentiment
huggingface.co
Updated Jun 12, 2025
Cite
shaikh R (2025). synthetic-bengali-sentiment [Dataset]. http://doi.org/10.57967/hf/5762
Explore at:
Unique identifier
https://doi.org/10.57967/hf/5762
Dataset updated
Jun 12, 2025
Authors
shaikh R
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Bengali Sentiment Analysis Dataset

Dataset Description

This dataset contains 44,236 Bengali sentences with corresponding sentiment labels, synthetically generated using ChatGPT for natural language processing and machine learning research.

Dataset Summary

Language: Bengali (বাংলা)
Total Entries: 44,236 synthetic sentences
Task: Sentiment Classification
Format: JSON
Generation Method: OpenAI ChatGPT (GPT-4)
License: CC0 1.0 Universal (Public Domain)… See the full description on the dataset page: https://huggingface.co/datasets/shaikh25/synthetic-bengali-sentiment.
Motamot: A Dataset for Revealing the Supremacy of Large Language Models over...
data.mendeley.com
Updated May 13, 2024
Cite
Fatema Tuj Johora Faria (2024). Motamot: A Dataset for Revealing the Supremacy of Large Language Models over Transformer Models in Bengali Political Sentiment Analysis [Dataset]. http://doi.org/10.17632/hdhnrrwdz2.1
Explore at:
Unique identifier
https://doi.org/10.17632/hdhnrrwdz2.1
Dataset updated
May 13, 2024
Authors
Fatema Tuj Johora Faria
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The "Motamot" dataset contains 7,058 data points labeled with Positive and Negative sentiments, tailored specifically for Political Sentiment Analysis in the Bengali language. It comprises 4,132 instances labeled Positive and 2,926 instances labeled Negative.

Specifics of the Core Data:

Split        Positive   Negative   Total
Train        3306       2341       5647
Test         413        293        706
Validation   413        292        705
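The split counts above can be cross-checked against the stated totals with a few lines of arithmetic. This is only a consistency check on the numbers quoted in the description; the dictionary layout and variable names are illustrative choices, not part of the dataset.

```python
# Per-split label counts as quoted in the dataset description.
splits = {
    "train": {"positive": 3306, "negative": 2341},
    "test": {"positive": 413, "negative": 293},
    "validation": {"positive": 413, "negative": 292},
}

# Total size of each split.
split_totals = {name: sum(counts.values()) for name, counts in splits.items()}

# Total number of instances per label across all splits.
label_totals = {
    label: sum(counts[label] for counts in splits.values())
    for label in ("positive", "negative")
}

grand_total = sum(split_totals.values())

print(split_totals)  # {'train': 5647, 'test': 706, 'validation': 705}
print(label_totals)  # {'positive': 4132, 'negative': 2926}
print(grand_total)   # 7058
```

The sums agree with the figures in the description: 5,647 / 706 / 705 per split, 4,132 Positive and 2,926 Negative overall, and 7,058 in total.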
Data from: Twitter corpus of Resource-Scarce Languages for Sentiment...
figshare.com
zip
Updated Jun 12, 2018
Cite
Rajat Singh; Nurendra Choudhary (2018). Twitter corpus of Resource-Scarce Languages for Sentiment Analysis and Multilingual Emoji Prediction [Dataset]. http://doi.org/10.6084/m9.figshare.6477782.v6
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.6477782.v6
Dataset updated
Jun 12, 2018
Dataset provided by
figshare
Authors
Rajat Singh; Nurendra Choudhary
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was created by leveraging social media platforms such as Twitter to develop a corpus across multiple languages. The corpus creation methodology is applicable to resource-scarce languages, provided the speakers of the language are active users of social media platforms. We present an approach to extract social media microblogs such as tweets (Twitter). We created a corpus for multilingual sentiment analysis and emoji prediction in Hindi, Bengali and Telugu. Further, we perform and analyze multiple NLP tasks using the corpus to obtain interesting observations.
Bengali Fake News Dataset
kaggle.com
Updated Aug 6, 2024
Cite
Evil Spirit05 (2024). Bengali Fake News Dataset [Dataset]. https://www.kaggle.com/datasets/evilspirit05/bengali-fake-news-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 6, 2024
Dataset provided by
Kaggle http://kaggle.com/
Authors
Evil Spirit05
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset offers a comprehensive collection of Bengali news articles specifically curated for the purpose of fake news detection. The data has been meticulously gathered and processed to aid researchers and practitioners in developing and testing models for distinguishing between real and fake news in the Bengali language.

Key Features:

Source: The dataset includes news articles scraped from popular news websites and public APIs. Major sources include reputable Bengali news portals to ensure a diverse range of content.

Coverage: The dataset spans from January 2018 to November 2018, providing a rich historical perspective on news trends.

Attributes:

Author’s Name: The individual or organization responsible for the news article.

Title: The headline of the news article.

Main Body: The full text of the news article.

News Date: The publication date of the article.

URL: The web address where the article was originally published.

Country Name: The country associated with the news source.

Source: The original news outlet or media.

Word Count: The total number of words in the article.

Data Processing:

The dataset underwent extensive cleaning to remove HTML symbols, unusual commas, and other formatting issues. It was then structured into a CSV format for ease of use and analysis. The data is well-suited for training and evaluating machine learning models aimed at fake news detection, text classification, and sentiment analysis.

Applications:

Fake News Detection: Train models to identify fake news articles.

Text Classification: Classify news articles based on their content.

Sentiment Analysis: Analyze the sentiment of Bengali news articles.

This dataset is an invaluable resource for researchers, developers, and data scientists working on text classification and fake news detection in Bengali. Its extensive coverage and detailed attributes provide a robust foundation for developing advanced analytical and machine learning models.
Bengali Hate Speech
opendatalab.com
paperswithcode.com
zip
Updated Sep 21, 2022
Cite
Institute of Data Science, National University of Ireland (2022). Bengali Hate Speech [Dataset]. https://opendatalab.com/OpenDataLab/Bengali_Hate_Speech
Explore at:
Available download formats: zip (2949793 bytes)
Dataset updated
Sep 21, 2022
Dataset provided by
RWTH Aachen University
Nanjing University of Science and Technology
Institute of Data Science, National University of Ireland
Vrije University Amsterdam
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Introduces three datasets, covering expressions of hate, commonly used topics, and opinions, for hate speech detection, document classification, and sentiment analysis, respectively.
BanglaBlend: A Large-Scale Nobel Dataset of Bangla Sentences Categorized by...
data.mendeley.com
Updated Dec 9, 2024
+ more versions
Cite
Umme Ayman ayman (2024). BanglaBlend: A Large-Scale Nobel Dataset of Bangla Sentences Categorized by Saint(Sadhu) and Common(Cholito) Form of Bengali Language [Dataset]. http://doi.org/10.17632/7rx9mk8v4m.3
Explore at:
Unique identifier
https://doi.org/10.17632/7rx9mk8v4m.3
Dataset updated
Dec 9, 2024
Authors
Umme Ayman ayman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The BanglaBlend dataset is a comprehensive collection of Bangla (Bengali) sentences meticulously categorized into two forms: Saint (Sadhu) and Common (Cholito). It comprises a total of 7,350 annotated Bangla sentences and is distributed as a preprocessed dataset to which several data preprocessing techniques have been applied. The dataset is designed to facilitate research and development in natural language processing (NLP) and computational linguistics, particularly for Bangla, a widely spoken language in Bangladesh and parts of India. In particular, it can be leveraged for several NLP tasks such as text summarization, text classification, sentiment analysis, and automatic language translation.
SentNoB Dataset
paperswithcode.com
Updated Jan 24, 2024
Cite
Khondoker Ittehadul Islam; Sudipta Kar; Md Saiful Islam; Mohammad Ruhul Amin (2024). SentNoB Dataset [Dataset]. https://paperswithcode.com/dataset/sentnob
Explore at:
Dataset updated
Jan 24, 2024
Authors
Khondoker Ittehadul Islam; Sudipta Kar; Md Saiful Islam; Mohammad Ruhul Amin
Description
Social Media User Sentiment Analysis Dataset. Each user comment is labeled as positive (1), negative (2), or neutral (0).
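Because the label ids are not in the usual negative/neutral/positive order (negative is 2 here, not 0), it may help to decode them explicitly before analysis. The sketch below assumes the labels arrive as plain integers; the names `SENTNOB_ID_TO_NAME` and `decode_labels` and the sample ids are hypothetical.

```python
from collections import Counter
from typing import Iterable, List

# Mapping as stated in the dataset description.
SENTNOB_ID_TO_NAME = {0: "neutral", 1: "positive", 2: "negative"}


def decode_labels(ids: Iterable[int]) -> List[str]:
    """Translate integer label ids into readable sentiment names."""
    return [SENTNOB_ID_TO_NAME[i] for i in ids]


# Hypothetical label column, for illustration only.
sample_ids = [1, 2, 0, 1]
print(decode_labels(sample_ids))  # ['positive', 'negative', 'neutral', 'positive']
print(Counter(decode_labels(sample_ids)))
```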
Aspect detection for restaurant dataset.
plos.figshare.com
xls
Updated Sep 20, 2024
+ more versions
Cite
Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Aspect detection for restaurant dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t009
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0308050.t009
Dataset updated
Sep 20, 2024
Dataset provided by
PLOS ONE
Authors
Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in the Bengali language due to the shortage of perfectly labeled datasets and the complex variations in the Bengali language. This study used two open-source benchmark datasets of the Bengali language, Cricket, and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers, the Robustly Optimized BERT Approach, and our proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same dataset. Amongst them, our proposed transformative Random Forest and Bidirectional Encoder Representations from Transformers achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89 respectively.
BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis
data.mendeley.com
Updated Nov 20, 2020
+ more versions
Cite
Md Ataur Rahman (2020). BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis [Dataset]. http://doi.org/10.17632/24xd7w7dhp.1
Explore at:
Unique identifier
https://doi.org/10.17632/24xd7w7dhp.1
Dataset updated
Nov 20, 2020
Authors
Md Ataur Rahman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a manually annotated Bangla Emotion corpus, which incorporates the diversity of fine-grained emotion expressions in social-media text. We considered fine-grained emotion labels such as Sadness, Happiness, Disgust, Surprise, Fear and Anger, which are, according to Paul Ekman (1999), the six basic emotion categories. For this task, we collected a large amount of raw text data from users' comments on two different Facebook groups (Ekattor TV and Airport Magistrates) and from the public posts of a popular blogger and activist, Dr. Imran H Sarker. These comments are mostly reactions to ongoing socio-political issues and to the economic successes and failures of Bangladesh. We scraped a total of 32,923 comments from the three sources mentioned above. Out of these, a total of 6,314 comments were annotated into the six categories. The distribution of the annotated corpus is as follows:

sad = 1341
happy = 1908
disgust = 703
surprise = 562
fear = 384
angry = 1416

We have also provided a balanced set from the above data and split the dataset into training and test sets of equal ratio. We considered a proportion of 5:1 for training and evaluation purposes. More information on the dataset and the experiments on it can be found in our paper (related links below).
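As a quick sanity check on the class distribution quoted above, the following sketch sums the six counts and reports each class's share. It only uses the numbers from the description; the variable names and the choice to report a largest-to-smallest imbalance ratio are illustrative assumptions.

```python
# Annotated comment counts per emotion, as quoted in the description.
counts = {
    "sad": 1341,
    "happy": 1908,
    "disgust": 703,
    "surprise": 562,
    "fear": 384,
    "angry": 1416,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
largest = max(counts, key=counts.get)
smallest = min(counts, key=counts.get)

print(total)  # 6314
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label:<9}{share:6.1%}")
print(f"imbalance ratio ({largest}/{smallest}): {counts[largest] / counts[smallest]:.2f}")
```

The counts add up to the 6,314 annotated comments stated in the description, and the spread between the largest and smallest class is one reason a separate balanced set is useful.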
BanglaSER: Bangla Audio for Emotion Recognition
kaggle.com
Updated Aug 27, 2024
+ more versions
Cite
Evil Spirit05 (2024). BanglaSER: Bangla Audio for Emotion Recognition [Dataset]. https://www.kaggle.com/datasets/evilspirit05/emotion
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 27, 2024
Dataset provided by
Kaggle
Authors
Evil Spirit05
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
BanglaSER is a specialized dataset designed for the task of Bangla speech emotion recognition. This dataset includes a rich collection of speech-audio recordings that capture a variety of fundamental human emotions. It is curated to support research and development in the field of speech emotion recognition, particularly for the Bangla language, and is suitable for various deep learning architectures.

Dataset Composition:

Total Number of Recordings: 1,467

Number of Speakers: 34 (17 male and 17 female)

Age Range of Speakers: 19 to 47 years

Recording Devices: Smartphones and laptops

Emotional States Covered:

Angry

Happy

Neutral

Sad

Surprise

Recording Structure:

Each emotional state is represented by:

3 statements, each spoken three times by each participant.

For Angry, Happy, Sad, and Surprise: 3 statements × 3 repetitions × 34 speakers × 4 emotions = 1,224 recordings.

For Neutral: 3 statements × 3 repetitions × 27 speakers = 243 recordings.
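The recording counts above can be reproduced with a short calculation. Note that 3 × 3 × 34 is 306 recordings per emotion, so the 1,224 figure covers all four non-neutral emotions together; that factor of four is inferred from the totals rather than stated explicitly. Variable names below are illustrative.

```python
STATEMENTS = 3
REPETITIONS = 3

# Number of speakers who recorded each emotion, per the description.
speakers_per_emotion = {
    "Angry": 34,
    "Happy": 34,
    "Sad": 34,
    "Surprise": 34,
    "Neutral": 27,
}

# Recordings per emotion = statements x repetitions x speakers.
recordings = {
    emotion: STATEMENTS * REPETITIONS * speakers
    for emotion, speakers in speakers_per_emotion.items()
}

non_neutral = sum(n for emotion, n in recordings.items() if emotion != "Neutral")
total = sum(recordings.values())

print(recordings["Angry"])    # 306
print(non_neutral)            # 1224
print(recordings["Neutral"])  # 243
print(total)                  # 1467
```

The total matches the 1,467 recordings listed under Dataset Composition.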

Key Features:

Balanced Representation:

The dataset is carefully balanced with an equal number of male and female participants, ensuring that the recordings reflect diverse voices and emotional expressions.

Emotions are evenly distributed across the dataset, providing a robust basis for training and evaluating emotion recognition models.

Realistic Recording Conditions:

Recordings are made using commonly available devices, such as smartphones and laptops, which helps in preserving the naturalistic quality of the audio.

The dataset reflects real-life acoustic environments, making it more applicable to real-world applications.

Deep Learning Compatibility:

BanglaSER is designed to be compatible with various deep learning architectures, including Convolutional Neural Networks (CNNs), Long Short-Term Memory Networks (LSTMs), and Bidirectional LSTMs (BiLSTMs).

The dataset can be used for a range of tasks, from emotion classification to sentiment analysis, and more.

Usage and Applications:

Emotion Recognition Models: BanglaSER provides a diverse set of recordings that are ideal for training models to recognize and classify emotions in Bangla speech.

Benchmarking and Evaluation: The dataset serves as a benchmark for evaluating the performance of emotion recognition systems and can help in comparing different model architectures and techniques.

Research and Development: Researchers can use BanglaSER to explore new methods in speech emotion recognition, develop novel algorithms, and enhance the understanding of emotion in speech.

Dataset Access:

Download Link: https://data.mendeley.com/datasets/t9h6p943xy/5

Documentation: Detailed documentation and guidelines for using the dataset are provided to assist users in effectively leveraging the data.

Acknowledgments:

We extend our gratitude to the contributors and participants who made this dataset possible. Their efforts have greatly enriched the field of speech emotion recognition and provided valuable resources for the community. Feel free to explore the dataset and utilize it in your research and projects. We look forward to seeing the innovative applications and advancements that will emerge from the use of BanglaSER
BanglaBook
huggingface.co
Updated Jul 17, 2024
+ more versions
Cite
Syed Rifat Raiyan (2024). BanglaBook [Dataset]. https://huggingface.co/datasets/Starscream-11813/BanglaBook
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 17, 2024
Authors
Syed Rifat Raiyan
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
BᴀɴɢʟᴀBᴏᴏᴋ: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews

This repository contains the code, data, and models of the paper titled "BᴀɴɢʟᴀBᴏᴏᴋ: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews" published in the Findings of the Association for Computational Linguistics: ACL 2023.

License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

Data Format

Each row consists of a book review sample. The… See the full description on the dataset page: https://huggingface.co/datasets/Starscream-11813/BanglaBook.
Bengali_Cyberbullying_Detection_Comments_Dataset
huggingface.co
Updated Oct 3, 2024
+ more versions
Cite
Faisal Ahmed (2024). Bengali_Cyberbullying_Detection_Comments_Dataset [Dataset]. https://huggingface.co/datasets/faisalahmed/Bengali_Cyberbullying_Detection_Comments_Dataset
Explore at:
Dataset updated
Oct 3, 2024
Authors
Faisal Ahmed
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains 44,001 Bengali comments, curated to detect cyberbullying using Natural Language Processing (NLP) techniques. Each comment is labeled by experts, categorizing different forms of harassment and offensive behavior. The dataset enables the identification of inappropriate content, ranging from mild to severe harassment, facilitating precise classification and analysis. This resource is designed for researchers and developers working on cyberbullying detection, sentiment… See the full description on the dataset page: https://huggingface.co/datasets/faisalahmed/Bengali_Cyberbullying_Detection_Comments_Dataset.
Bangla Speech Personality Traits Data
data.mendeley.com
Updated Apr 17, 2024
Cite
Md. Sajeebul Islam Sk. (2024). Bangla Speech Personality Traits Data [Dataset]. http://doi.org/10.17632/fb6dm3yb6m.1
Explore at:
Unique identifier
https://doi.org/10.17632/fb6dm3yb6m.1
Dataset updated
Apr 17, 2024
Authors
Md. Sajeebul Islam Sk.
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The main characteristics of the collected data are:
Data types: Audio
Data size: 1750 (speech-text-phase), 1000 (speech-base-phase)
Linguistic diversity: Short text
Audio capturing quality: 44.1 kHz, Mono

We created a new personality traits dataset for our research work because there is a noticeable absence of datasets for automatically assessing personality from Bangla speech. This data, processed with machine learning models, demonstrated that different personalities produce varying magnitudes at different frequencies, exhibiting distinct patterns.
