Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bangla (Bengali) sentiment analysis dataset
The repository contains 3307 negative reviews and 8500 positive reviews, collected from YouTube comments on Bengali dramas and manually annotated.
If you use this dataset, please cite the following paper:
@inproceedings{sazzed2020cross, title={Cross-lingual sentiment classification in low-resource Bengali language}, author={Sazzed, Salim}, booktitle={Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)}, pages={50--60}, year={2020} }
If you have any questions, please email me at salimsazzad222@gmail.com.
https://www.apache.org/licenses/LICENSE-2.0
This dataset labels each comment as negative or positive.
Data Set For Sentiment Analysis On Bengali News Comments
This is a data set for sentiment analysis of Bangla news comments. Every comment was annotated by three different individuals to capture three different perspectives, and the final tag was chosen by majority decision. The data set contains 13802 comments in total.
https://data.mendeley.com/datasets/n53xt69gnf/2
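The majority decision described above can be illustrated with a minimal sketch. This is not the dataset authors' code: the label names, the sample votes, and the choice to return `None` on a three-way tie are all assumptions made for illustration.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most annotators, or None when the top labels tie."""
    counts = Counter(votes).most_common()
    # A tie exists when the two most frequent labels have the same count.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical annotations: three votes per comment (label names are assumed).
annotations = [
    ("positive", "positive", "negative"),
    ("negative", "negative", "negative"),
    ("positive", "negative", "neutral"),
]

final_tags = [majority_label(votes) for votes in annotations]
print(final_tags)
```

With three annotators and two possible labels a tie cannot occur; the `None` branch only matters if the annotation scheme has three or more labels.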
Aims to improve NLP work on Bengali and Romanized Bangla.
Please cite the paper if you use the dataset or lexicon
If you use this dataset, please cite the following paper- @inproceedings{sazzed2020cross, title={Cross-lingual sentiment classification in low-resource Bengali language}, author={Sazzed, Salim}, booktitle={Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)}, pages={50--60}, year={2020} } If you have any questions, please email me- salimsazzad222@gmail.com.
This dataset was created by Nuhash Afnan
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset presents the news articles published on a renowned Bengali YouTube news channel, along with the public comments, replies, and other corresponding information. There are 762,678 samples with 15 features. The features include video URL, title of the news, likes on the video, video views, publishing date, hashtags, video description, comments with corresponding likes, and replies with likes. To protect the privacy of the commenters, their names have been encoded.
This dataset contains around 1300 positive and negative Bengali (Bangla) sentiment words. The lexicon was created from a Bengali review corpus. If you use this lexicon, please cite the following paper:
@inproceedings{sazzed2020development, title={Development of Sentiment Lexicon in Bengali utilizing Corpus and Cross-lingual Resources}, author={Sazzed, Salim}, booktitle={2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI)}, pages={237--244}, year={2020}, organization={IEEE Computer Society} }
https://www.cs.odu.edu/~ssazzed/IEEE_IRI_2020.pdf
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Bangla Product Comments Dataset is a comprehensive collection of product reviews gathered from diverse ecommerce platforms in Bangladesh. This dataset offers a rich source of information reflecting customer opinions and sentiments towards various products available online. This dataset holds significant value for businesses, researchers, and data scientists interested in understanding consumer behavior, product perception, and sentiment analysis within the Bangladeshi ecommerce landscape. By leveraging this dataset, stakeholders can derive actionable insights to enhance product quality, marketing strategies, and overall customer satisfaction.
Columns:
1. Product_ID: A unique identifier for each product, facilitating organization and referencing.
2. Date: The date when the comment was posted, providing temporal context for analysis.
3. Customer Name: The name or identifier of the customer who submitted the comment, ensuring traceability and potential user segmentation.
4. Rating: A numerical representation (typically on a scale of 1 to 5) reflecting the customer's overall satisfaction level with the product.
5. Label Sentiment: A categorical label assigned to each comment indicating the sentiment expressed by the customer (e.g., positive, negative). This classification facilitates sentiment analysis tasks.
6. Comment: The actual text of the customer's review or comment, conveying specific opinions, feedback, or experiences regarding the product.
This dataset contains 1443 Bangla book reviews, of which 471 are annotated as negative and 972 as positive. All the reviews were collected from different online book shops and social media groups, and were manually annotated by two native Bengali speakers. Although the dataset is relatively small, it can be used for learning as well as research purposes.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
ROOTS Subset: roots_indic-bn_bangla_sentiment_classification_datasets
Bangla Sentiment Classification Datasets
Dataset uid: bangla_sentiment_classification_datasets
Description
Multiple sentiment classification datasets for Bengali, which can also be used for training LMs. The datasets are the following:
ABSA_datasets -- This dataset was developed to perform aspect-based sentiment analysis in Bangla. License: CC BY 4.0
SAIL_data -- This dataset, consists…
See the full description on the dataset page: https://huggingface.co/datasets/bigscience-data/roots_indic-bn_bangla_sentiment_classification_datasets.
Word and Doc2Vec file for Bengali Sentiment Analysis
This dataset is used in Multilabel sentiment analysis and emotion detection for YouTube comments in different kinds of Bengali videos.
There are two files in the folder. There might be multiple comments with the same text. Note also that the comments collected here contain abusive and vulgar words, slang, and personal attacks. Therefore, we ensured that all annotators are adults.
Sentiment.csv
Id - Unique id number for the comment.
Text - Text of the data.
Label - 1 (3 class label) or 2 (5 class label).
Score - Denotes the polarity of the comment. In three class labelling: 1 (positive), 0 (neutral), -1 (negative). In five class labelling: 2 (highly positive), 1 (positive), 0 (neutral), -1 (negative), -2 (highly negative).
Lan - Language of the comment. EN (English), BN (Bengali), RN (Romanized Bangla).
Domain - Category of the video.
Emotion.csv
Id - Unique id number for the comment.
Text - Text of the data.
emotion - Corresponding emotion of the comment. Anger/Joy/Disgust/Fear/Surprise/Sad/None (no emotion found).
Lan - Language of the comment. EN (English), BN (Bengali), RN (Romanized Bangla).
Domain - Category of the video.
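As a rough illustration of the Sentiment.csv layout described above, the sketch below decodes the Score column according to the Label column. It assumes the file is comma-separated with a header row using exactly the documented column names, and that Label is stored as the text `1` or `2`; the sample rows and Domain values are invented for the example and are not taken from the dataset.

```python
import csv
import io

# Hypothetical rows following the documented Sentiment.csv columns.
SAMPLE = """Id,Text,Label,Score,Lan,Domain
1,example comment a,1,1,EN,drama
2,example comment b,2,-2,BN,news
3,example comment c,1,0,RN,music
4,example comment d,2,2,EN,drama
"""

# Score meanings for the 3 class scheme (Label 1) and the 5 class scheme (Label 2).
THREE_CLASS = {1: "positive", 0: "neutral", -1: "negative"}
FIVE_CLASS = {
    2: "highly positive",
    1: "positive",
    0: "neutral",
    -1: "negative",
    -2: "highly negative",
}


def decode_rows(handle):
    """Return (Id, Lan, polarity name) tuples for every row in the CSV handle."""
    decoded_rows = []
    for row in csv.DictReader(handle):
        scheme = THREE_CLASS if row["Label"] == "1" else FIVE_CLASS
        decoded_rows.append((row["Id"], row["Lan"], scheme[int(row["Score"])]))
    return decoded_rows


decoded = decode_rows(io.StringIO(SAMPLE))
for item in decoded:
    print(item)
```

To read the real file, the `io.StringIO(SAMPLE)` argument could be replaced with an opened file handle, after checking the actual delimiter, header names, and encoding.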
If you use the dataset in any research work, please cite the following paper:
N. Irtiza Tripto and M. Eunus Ali, "Detecting Multilabel Sentiment and Emotions from Bangla YouTube Comments," 2018 International Conference on Bangla Speech and Language Processing (ICBSLP), Sylhet, 2018, pp. 1-6.
doi: 10.1109/ICBSLP.2018.8554875
It will be helpful for researchers, especially those analyzing sentiment in social media in non-English languages.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created by leveraging social media platforms such as Twitter to develop corpora across multiple languages. The corpus creation methodology is applicable to resource-scarce languages, provided the speakers of that language are active users of social media platforms. We present an approach to extract social media microblogs such as tweets (Twitter). We created a corpus for multilingual sentiment analysis and emoji prediction in Hindi, Bengali and Telugu. Further, we perform and analyze multiple NLP tasks using the corpus and report our observations.
2-2325: From Twitter datasets, May-November 2013
2326-16127: From http://dx.doi.org/10.17632/n53xt69gnf.3
Introduces three datasets, covering expressions of hate, commonly used topics, and opinions, for hate speech detection, document classification, and sentiment analysis, respectively.
This dataset was created by Tazim H