GoEmotions is a corpus of 58k carefully curated comments extracted from Reddit, with human annotations for 27 emotion categories or Neutral. Number of examples: 58,009. Number of labels: 27 + Neutral. Maximum sequence length in training and evaluation datasets: 30. On top of the raw data, the dataset also includes a version filtered based on rater agreement, which contains a train/test/validation split: Size of training dataset: 43,410. Size of test dataset: 5,427. Size of validation dataset: 5,426. The emotion categories are: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise.
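The split sizes quoted above can be sanity-checked with a little arithmetic. The sketch below uses only the figures stated in the description; the variable names are illustrative and not part of any dataset API.

```python
# Figures as stated in the dataset description above.
raw_examples = 58_009
splits = {"train": 43_410, "test": 5_427, "validation": 5_426}

# Size of the filtered version, i.e. the sum of its three splits.
filtered_total = sum(splits.values())

# Raw examples that do not appear in the filtered version.
removed_by_filtering = raw_examples - filtered_total

# Share of the filtered data that falls into each split.
shares = {name: size / filtered_total for name, size in splits.items()}

print(filtered_total)
print(removed_by_filtering)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The three splits add up to 54,263 examples, so 3,746 of the 58,009 raw examples are absent from the filtered version, and the split is roughly 80/10/10.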
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Google AI dataset for sentiment / emotions analysis
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset is a translation of the Google GoEmotions emotion classification dataset. All features remain unchanged, except for the addition of a new ru_text column containing the text translated into Russian. For the translation process, I used the Deep translator with the Google engine. You can find all the details about the translation, the raw .csv files, and other material in this GitHub repository. For more information, also check the official original dataset card.… See the full description on the dataset page: https://huggingface.co/datasets/seara/ru_go_emotions.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for GoEmotions
Dataset Summary
The GoEmotions dataset contains 58k carefully curated Reddit comments labeled for 27 emotion categories or Neutral. The raw data is included as well as the smaller, simplified version of the dataset with predefined train/val/test splits.
Supported Tasks and Leaderboards
This dataset is intended for multi-class, multi-label emotion classification.
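Because a comment can carry several emotions at once, a multi-label target is often represented as a multi-hot vector. A minimal sketch follows, assuming the 27 categories are ordered alphabetically with neutral last; the index order and the helper name `to_multi_hot` are assumptions for illustration and may differ from the ordering a particular release uses.

```python
# The 27 emotion categories plus neutral.
# The index order here is an assumption (alphabetical, neutral last).
EMOTIONS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]
LABEL_TO_INDEX = {name: index for index, name in enumerate(EMOTIONS)}


def to_multi_hot(labels):
    """Return a 0/1 vector with a 1 at the position of every given label."""
    vector = [0] * len(EMOTIONS)
    for label in labels:
        vector[LABEL_TO_INDEX[label]] = 1
    return vector


# A comment annotated with two emotions sets two positions to 1.
example = to_multi_hot(["gratitude", "joy"])
print(len(example), sum(example))
```

A classifier for this task would typically emit one independent score per position rather than a single softmax over all 28 classes.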
Languages
The data is in English and… See the full description on the dataset page: https://huggingface.co/datasets/antoniomenezes/go_emotions_ptbr.
This dataset was created by Orientino
This dataset was created by Enes Ozturk
GoEmotions is a human-annotated dataset of 58k Reddit comments. It is labeled with 27 emotion categories (12 positive, 11 negative, 4 ambiguous) plus “neutral”, making it widely suitable for conversation understanding tasks that require a subtle differentiation between emotion expressions.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset contains 58K carefully curated Reddit comments labeled for 27 emotion categories: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, & surprise.
GoEmotions is a human-annotated dataset of 58k Reddit comments. It is labeled with 27 emotion categories (12 positive, 11 negative, 4 ambiguous) plus “neutral”, making it widely suitable for conversation understanding tasks that require a subtle differentiation between emotion expressions.
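The 12/11/4 grouping can be written down as a lookup table. This listing does not say which emotion belongs to which group, so the assignment below is an assumption based on the sentiment grouping published with the original GoEmotions release and should be checked against that source. The name `sentiment_of` is a hypothetical helper.

```python
# Assumed sentiment grouping of the 27 emotions (verify against the
# original GoEmotions release before relying on it).
SENTIMENT_GROUPS = {
    "positive": {
        "admiration", "amusement", "approval", "caring", "desire",
        "excitement", "gratitude", "joy", "love", "optimism", "pride",
        "relief",
    },
    "negative": {
        "anger", "annoyance", "disappointment", "disapproval", "disgust",
        "embarrassment", "fear", "grief", "nervousness", "remorse",
        "sadness",
    },
    "ambiguous": {"confusion", "curiosity", "realization", "surprise"},
}


def sentiment_of(emotion):
    """Map a fine-grained emotion to its coarse sentiment group."""
    if emotion == "neutral":
        return "neutral"
    for group, members in SENTIMENT_GROUPS.items():
        if emotion in members:
            return group
    raise KeyError(f"unknown emotion: {emotion}")


# Group sizes should match the 12 / 11 / 4 counts in the description.
print({group: len(members) for group, members in SENTIMENT_GROUPS.items()})
```

Collapsing the fine-grained labels this way gives a coarser sentiment-level task from the same annotations.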
GoEmotions is a human-annotated dataset of 58k Reddit comments. It is labeled with 27 emotion categories (12 positive, 11 negative, 4 ambiguous) plus “neutral”, making it widely suitable for conversation understanding tasks that require a subtle differentiation between emotion expressions.
AutoTrain Dataset for project: twitter-goemotions-binary-fear-classification
Dataset Description
This dataset has been automatically processed by AutoTrain for project twitter-goemotions-binary-fear-classification.
Languages
The BCP-47 code for the dataset's language is unk.
Dataset Structure
Data Instances
A sample from this dataset looks as follows: [ { "text": "Downvoting comments you don't like is your right."… See the full description on the dataset page: https://huggingface.co/datasets/garrettbaber/twitter-roberta-goemotions-binary-fear-classification.
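A binary fear target can be derived from multi-label annotations by checking whether "fear" is among a comment's labels. The sketch below is one plausible way to do this; it is not the preprocessing AutoTrain actually applied, which this listing does not describe. The records, field names, and the helper `to_binary_target` are hypothetical placeholders, not rows from the dataset.

```python
def to_binary_target(labels, positive_label="fear"):
    """Return 1 if the positive label is among the annotations, else 0."""
    return 1 if positive_label in labels else 0


# Hypothetical records with placeholder text, for illustration only.
records = [
    {"text": "<comment A>", "labels": ["fear", "nervousness"]},
    {"text": "<comment B>", "labels": ["neutral"]},
    {"text": "<comment C>", "labels": ["joy", "gratitude"]},
]

# Attach a 0/1 target to each record without modifying the originals.
binary_records = [
    {"text": record["text"], "target": to_binary_target(record["labels"])}
    for record in records
]

print([record["target"] for record in binary_records])
```

With a target this sparse, the positive class is likely to be a small minority, so class balance is worth checking before training.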
GoEmotions is a human-annotated dataset of 58k Reddit comments. It is labeled with 27 emotion categories (12 positive, 11 negative, 4 ambiguous) plus “neutral”, making it widely suitable for conversation understanding tasks that require a subtle differentiation between emotion expressions.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset released by Google with text and the emotions detected in those texts
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by AmanShukla111
Released under Apache 2.0
tgelton/GoEmotions dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Humans' emotional perception is subjective by nature, in which each individual could express different emotions regarding the same textual content. Existing datasets for emotion analysis commonly depend on a single ground truth per data sample, derived from majority voting or averaging the opinions of all annotators. We introduce a new non-aggregated dataset, namely StudEmo, that contains 5,182 customer reviews, each annotated by 25 people with intensities of eight emotions from Plutchik's model, extended with valence and arousal. We also propose three personalized models that use not only textual content but also the individual human perspective, providing the model with different approaches to learning human representations. The experiments were carried out as a multitask classification on two datasets: our StudEmo dataset and GoEmotions dataset, which contains 28 emotional categories. The proposed personalized methods significantly improve prediction results, especially for emotions that have low inter-annotator agreement.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
GoEmotions Spanish
A Spanish translation (using EasyNMT) of the GoEmotions dataset.
For more information check the official Model Card
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Ethics reference is: 2020/2144326527-03
The aim of this research is to power chatbots with algorithms that can identify a potential buyer from customers’ chats in order to offer them a sale. Detecting the potential customer from the chat is the main challenge we have to overcome. Discovering emotions in chat helps us understand more about customers’ intention to purchase or accept an offer. Experimental (empirical) research is defined as data-based research that relies on experiments or observations. Moreover, in experimental research, the researcher should produce a verifiable conclusion. Therefore, we developed a hypothesis and established an experimental design to prove or disprove it. The Null Hypothesis (H0): There is no relation between users’ emotions and their online buying decision-making. The Alternative Hypothesis (H1): User emotions play a significant role in online purchasing decision-making. To prove or disprove this hypothesis, experimental research with a positive approach has been designed. The goal of this experimental research is to find out whether there is a relation between users’ emotions and their purchasing decision-making process. We found four datasets that are labelled with emotion tags and then filtered them for conversations about purchasing (both accepting and declining purchases).
The four datasets used to test the hypothesis in this research, with their references: EmotionLines: Dialogues extracted from the Friends TV series, labelled with basic emotions: Anger, Disgust, Fear, Happiness, Sadness, and Surprise. The dialogue emotions were identified by humans in a survey. Chen, S. Y., Hsu, C. C., Kuo, C. C., Huang, T. H. K., & Ku, L. W. (2019). EmotionLines: An emotion corpus of multi-party conversations. LREC 2018 - 11th International Conference on Language Resources and Evaluation, 1597–1601.
CARER: Tweets extracted from Twitter. They are in English, and their emotions were identified from the hashtags given by their authors. Emotions are Anger, Anticipation, Disgust, Fear, Joy, Sadness, Surprise, and Trust. Saravia, E., Toby Liu, H. C., Huang, Y. H., Wu, J., & Chen, Y. S. (2018). CARER: Contextualized affect representations for emotion recognition. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018, 3687–3697. https://doi.org/10.18653/v1/d18-1404
EmotionPush: Messages extracted from Facebook Messenger with 7 emotions: Joy, Anticipation, Neutral, Tired, Anger, Fear, and Sadness. Huang, C. Y., & Ku, L. W. (2018). EmotionPush: Emotion and Response Time Prediction Towards Human-Like Chatbots. 2018 IEEE Global Communications Conference, GLOBECOM 2018 - Proceedings. https://doi.org/10.1109/GLOCOM.2018.8647331
GoEmotions: The data are extracted from Reddit comments and labelled with 27 emotions. Demszky, D., Movshovitz-Attias, D., Ko, J., Cowen, A., Nemade, G., & Ravi, S. (2020). GoEmotions: A Dataset of Fine-Grained Emotions. 4040–4054. https://doi.org/10.18653/v1/2020.acl-main.372
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for GoEmotions
Dataset Summary
The RuGoEmotions dataset contains 34k Reddit comments labeled for 9 emotion categories (joy, interest, surprise, sadness, anger, disgust, fear, guilt and neutral). The dataset already comes with predefined train/val/test splits.
Supported Tasks and Leaderboards
This dataset is intended for multi-class, multi-label emotion classification.
Languages
The data is in Russian.
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/Djacon/ru_goemotions.
Source:
This dataset is a machine translated version of the Many Emotions dataset available here: https://huggingface.co/datasets/ma2za/many_emotions It was translated into Finnish using DeepL: https://www.deepl.com/translator The Many Emotions dataset itself is a combination of three other emotion annotated datasets. These datasets are:
Daily Dialog: https://huggingface.co/datasets/daily_dialog GoEmotions: https://huggingface.co/datasets/go_emotions Emotion:… See the full description on the dataset page: https://huggingface.co/datasets/TurkuNLP/many_emotions_finnish.