47 datasets found

goemotions
huggingface.co
Updated Aug 12, 2023
Cite
Manuel Romero (2023). goemotions [Dataset]. https://huggingface.co/datasets/mrm8488/goemotions
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 12, 2023
Authors
Manuel Romero
Description
GoEmotions

GoEmotions is a corpus of 58k carefully curated comments extracted from Reddit, with human annotations to 27 emotion categories or Neutral.

Number of examples: 58,009. Number of labels: 27 + Neutral. Maximum sequence length in training and evaluation datasets: 30.

On top of the raw data, we also include a version filtered based on rater agreement, which contains a train/test/validation split:

Size of training dataset: 43,410. Size of test dataset: 5,427. Size of… See the full description on the dataset page: https://huggingface.co/datasets/mrm8488/goemotions.
goemotions
tensorflow.org
opendatalab.com
Updated Dec 6, 2022
Cite
(2022). goemotions [Dataset]. https://www.tensorflow.org/datasets/catalog/goemotions
Explore at:
Dataset updated
Dec 6, 2022
Description
The GoEmotions dataset contains 58k carefully curated Reddit comments labeled for 27 emotion categories or Neutral. The emotion categories are admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('goemotions', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
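
The 27 categories listed above, plus Neutral, are normally handled as integer label ids. The following is a minimal sketch of such a vocabulary. It assumes alphabetical ordering with neutral as the last id (27), which agrees with the index listings quoted in other entries on this page but should be checked against whichever copy of the dataset is actually loaded. The names LABELS, LABEL2ID, ID2LABEL and decode are illustrative, not part of any library.

```python
# Illustrative label vocabulary (assumed order: alphabetical, neutral last).
EMOTIONS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise",
]
LABELS = EMOTIONS + ["neutral"]

# Two-way lookup between names and integer ids.
LABEL2ID = {name: i for i, name in enumerate(LABELS)}
ID2LABEL = {i: name for name, i in LABEL2ID.items()}


def decode(label_ids):
    """Turn a list of integer ids into emotion names."""
    return [ID2LABEL[i] for i in label_ids]


print(len(LABELS))   # 28
print(decode([27]))  # ['neutral']
```

Keeping the mapping in one place makes it easier to spot copies of the dataset that renumber or drop labels.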
go_emotions_ptbr
huggingface.co
kaggle.com
Updated Aug 14, 2023
Cite
Antonio Marcio Adiodato de Menezes (2023). go_emotions_ptbr [Dataset]. https://huggingface.co/datasets/antoniomenezes/go_emotions_ptbr
Explore at:
Dataset updated
Aug 14, 2023
Authors
Antonio Marcio Adiodato de Menezes
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset Card for GoEmotions

Dataset Summary

The GoEmotions dataset contains 58k carefully curated Reddit comments labeled for 27 emotion categories or Neutral. The raw data is included as well as the smaller, simplified version of the dataset with predefined train/val/test splits.

Supported Tasks and Leaderboards

This dataset is intended for multi-class, multi-label emotion classification.

Languages

The data is in English and Brazilian Portuguese… See the full description on the dataset page: https://huggingface.co/datasets/antoniomenezes/go_emotions_ptbr.

GoEmotions (UA) – Emotion Classification Dataset

kaggle.com

zip

Updated Nov 30, 2025

Cite

Oleksii Chumak (2025). GoEmotions (UA) – Emotion Classification Dataset [Dataset]. https://www.kaggle.com/datasets/oleksiichumak/goemotions-ua-emotion-classification-dataset

Explore at:

Available download formats: zip (4527621 bytes)

Dataset updated

Nov 30, 2025

Authors

Oleksii Chumak

License

CC0 1.0 (https://creativecommons.org/publicdomain/zero/1.0/)

Description

GoEmotions Ukrainian Dataset

Ukrainian translation of the GoEmotions dataset for emotion classification in text.

Dataset Description

This dataset is a high-quality Ukrainian translation of Google's GoEmotions dataset, which contains Reddit comments labeled with 28 emotion categories.

Translation Methodology

Model: Helsinki-NLP/opus-mt-en-uk - specialized English-Ukrainian translation model
Post-processing: Manual quality tuning and refinement to ensure natural Ukrainian phrasing
Quality: 100% Ukrainian text with natural, context-aware translations

Dataset Statistics

Total samples: 54,263 Reddit comments
Language: Ukrainian (translated from English)
Emotion categories: 27 + neutral (28 labels in total)
Splits: Train (43,410), Validation (5,426), Test (5,427)
Task type: Multi-label classification (texts can have multiple emotions)
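
As a quick arithmetic check on the statistics above, the three split sizes can be summed and expressed as shares of the total. The sketch below also compares the total with the 58,009 examples quoted for the unfiltered corpus elsewhere on this page. That this translation was built from the filtered splits is an inference from the matching split sizes, not something the card states.

```python
# Split sizes quoted in the dataset statistics.
splits = {"train": 43_410, "validation": 5_426, "test": 5_427}
total = sum(splits.values())

# Share of each split in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in splits.items()}

# Figure quoted for the unfiltered corpus (assumed to be the starting point).
RAW_EXAMPLES = 58_009
dropped = RAW_EXAMPLES - total

print(total)    # 54263
print(shares)   # {'train': 80.0, 'validation': 10.0, 'test': 10.0}
print(dropped)  # 3746
```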

Emotion Categories

The dataset includes 28 emotion categories:

Category	Ukrainian	Category	Ukrainian
admiration	захоплення	amusement	розвага
anger	гнів	annoyance	роздратування
approval	схвалення	caring	турбота
confusion	розгубленість	curiosity	цікавість
desire	бажання	disappointment	розчарування
disapproval	несхвалення	disgust	відраза
embarrassment	збентеження	excitement	збудження
fear	страх	gratitude	вдячність
grief	горе	joy	радість
love	любов	nervousness	нервозність
optimism	оптимізм	pride	гордість
realization	усвідомлення	relief	полегшення
remorse	каяття	sadness	сум
surprise	здивування	neutral	нейтрально

File Structure

CSV Format

The dataset is provided in CSV format with the following columns:

text,text_uk,labels,id,split

text: Original English text
text_uk: Ukrainian translation
labels: List of emotion label indices (0-27, multi-label)
id: Unique identifier
split: Data split (train/validation/test)
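
A minimal sketch of reading rows in this layout with only Python's standard library follows. It uses the single row that the dataset card gives as its example and converts the labels field from text into a list of integers. One assumption to note: a multi-label value such as [0, 5] contains a comma, so it would need to be quoted in the file for a CSV reader to keep it in one field. The helper name parse_rows is illustrative.

```python
import ast
import csv
import io

# Header plus the example row from the dataset card, as CSV text.
SAMPLE = (
    "text,text_uk,labels,id,split\n"
    '"My favourite food is anything I didn\'t have to cook myself.",'
    '"Моя улюблена їжа - це все, що я не мусив сам готувати.",[27],eebbqej,train\n'
)


def parse_rows(handle):
    """Yield one dict per row, with `labels` converted to a list of ints."""
    for row in csv.DictReader(handle):
        row["labels"] = ast.literal_eval(row["labels"])
        yield row


rows = list(parse_rows(io.StringIO(SAMPLE)))
print(rows[0]["labels"], rows[0]["split"])  # [27] train
```

For the real file, the same function can be given an open file handle (opened with encoding="utf-8" and newline="") instead of the in-memory sample.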

Example

text,text_uk,labels,id,split
"My favourite food is anything I didn't have to cook myself.","Моя улюблена їжа - це все, що я не мусив сам готувати.",[27],eebbqej,train

Usage

Loading the Dataset

import pandas as pd

# Load dataset
df = pd.read_csv('goemotions_uk.csv')

# Parse labels
import ast
df['labels'] = df['labels'].apply(ast.literal_eval)

# Split by data split
train_df = df[df['split'] == 'train']
val_df = df[df['split'] == 'validation']
test_df = df[df['split'] == 'test']

Multi-label Classification

from sklearn.preprocessing import MultiLabelBinarizer

# Convert labels to multi-hot encoding
mlb = MultiLabelBinarizer(classes=list(range(28)))
mlb.fit([list(range(28))])

train_labels = mlb.transform(train_df['labels'])
val_labels = mlb.transform(val_df['labels'])
test_labels = mlb.transform(test_df['labels'])

With Transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Use multilingual models
model_name = "xlm-roberta-base" # or "TurkuNLP/bert-base-ukrainian-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
  model_name, 
  num_labels=28,
  problem_type="multi_label_classification"
)

# Tokenize Ukrainian text
encodings = tokenizer(
  train_df['text_uk'].tolist(),
  truncation=True,
  padding=True,
  max_length=128
)
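
With problem_type="multi_label_classification", each of the 28 output logits is scored independently, so predictions are usually obtained by applying a sigmoid to every logit and keeping the labels above a cut-off. The sketch below shows that decoding step on made-up logits using only the standard library. The 0.5 threshold and the function names are assumptions for illustration; in practice the threshold is often tuned on the validation split.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return the indices whose sigmoid probability meets the threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


# Made-up logits for a 28-label model: only positions 0 and 17 are positive.
logits = [-3.0] * 28
logits[0] = 1.2
logits[17] = 0.4

print(predict_labels(logits))       # [0, 17]
print(predict_labels(logits, 0.7))  # [0]
```

Unlike a softmax with argmax, this can return zero, one, or several labels per text, which matches the multi-label setup described above.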

Applications

This dataset is suitable for:

Emotion detection in Ukrainian social media and text
Sentiment analysis with fine-grained emotional categories
Multi-label text classification research
Ukrainian NLP model development and evaluation
Cross-lingual emotion recognition studies

Citation

If you use this dataset, please cite the original GoEmotions paper:

@inproceedings{demszky2020goemotions,
 title={{GoEmotions: A Dataset of Fine-Grained Emotions}},
 author={Demszky, Dorottya and Movshovitz-Attias, Dana and Ko, Jeongwoo and Cowen, Alan and Nemade, Gaurav and Ravi, Sujith},
 booktitle={58th Annual Meeting of the Association for Computational Linguistics (ACL)},
 year={2020}
}

Lice...

goemotions-5point-sentiment
huggingface.co
Updated Mar 23, 2025
Cite
Jose (2025). goemotions-5point-sentiment [Dataset]. https://huggingface.co/datasets/spacesedan/goemotions-5point-sentiment
Explore at:
Dataset updated
Mar 23, 2025
Authors
Jose
Description
GoEmotions 5-Point Sentiment Dataset

This dataset is a modified version of the GoEmotions dataset created by Google. The original dataset consists of 58k carefully curated Reddit comments labeled with 27 fine-grained emotion categories plus a neutral label.

📘 About This Version

This version maps the original GoEmotions emotion labels into a 5-point sentiment scale, making it more suitable for traditional sentiment analysis tasks:

Original Label(s) Mapped Sentiment… See the full description on the dataset page: https://huggingface.co/datasets/spacesedan/goemotions-5point-sentiment.
Go Emotion Dataset
kaggle.com
zip
Updated Jul 19, 2023
Cite
Akhil Vibhakar (2023). Go Emotion Dataset [Dataset]. https://www.kaggle.com/datasets/akhilvibhakar/go-emotion-dataset
Explore at:
Available download formats: zip (9100876 bytes)
Dataset updated
Jul 19, 2023
Authors
Akhil Vibhakar
Description
This is Google's GoEmotions dataset, which contains 27 categories of emotions on 56k English Reddit comments.
go-emotions-cleaned
huggingface.co
Updated Oct 17, 2025
Cite
Keyur Jotaniya (2025). go-emotions-cleaned [Dataset]. https://huggingface.co/datasets/Keyurjotaniya007/go-emotions-cleaned
Explore at:
Dataset updated
Oct 17, 2025
Authors
Keyur Jotaniya
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset Summary

The GoEmotions Cleaned dataset is a refined version of the original Google GoEmotions dataset. It has been cleaned, simplified, and reformatted for use in text classification tasks such as emotion detection, sentiment analysis, and multi-label emotion prediction. This version retains only two essential columns — text and label — making it ideal for model fine-tuning and experimentation with Transformer-based architectures.

Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/Keyurjotaniya007/go-emotions-cleaned.
GoEmotions
live.european-language-grid.eu
csv
Updated Dec 30, 2020
Cite
(2020). GoEmotions [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/5011
Explore at:
Available download formats: csv
Dataset updated
Dec 30, 2020
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Dataset contains 58K carefully curated Reddit comments labeled for 27 emotion categories: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, & surprise.
GoEmotions Dataset1
kaggle.com
zip
Updated Jul 8, 2023
Cite
Enes Ozturk (2023). GoEmotions Dataset1 [Dataset]. https://www.kaggle.com/datasets/enesztrk/goemotions-dataset
Explore at:
Available download formats: zip (5339801 bytes)
Dataset updated
Jul 8, 2023
Authors
Enes Ozturk
Description
Dataset

This dataset was created by Enes Ozturk

goemotion-ekman-emotions
huggingface.co
Updated Aug 2, 2025
Cite
moodlogue (2025). goemotion-ekman-emotions [Dataset]. https://huggingface.co/datasets/Frankhihi/goemotion-ekman-emotions
Explore at:
Dataset updated
Aug 2, 2025
Authors
moodlogue
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
GoEmotions Ekman Emotions Dataset

Dataset Description

This dataset contains 10,000 text samples from Reddit comments mapped to the 7 basic Ekman emotions. It's derived from the original GoEmotions dataset and processed specifically for emotion classification research using Paul Ekman's fundamental emotion model.

Supported Tasks

Text Classification: Multi-class emotion classification
Sentiment Analysis: Fine-grained emotion detection
Psychology Research:… See the full description on the dataset page: https://huggingface.co/datasets/Frankhihi/goemotion-ekman-emotions.
ru_go_emotions
huggingface.co
Updated Aug 26, 2023
Cite
Vyacheslav Litvinov (2023). ru_go_emotions [Dataset]. https://huggingface.co/datasets/seara/ru_go_emotions
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 26, 2023
Authors
Vyacheslav Litvinov
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Description

This dataset is a translation of the Google GoEmotions emotion classification dataset. All features remain unchanged, except for the addition of a new ru_text column containing the translated text in Russian. For the translation process, I used the Deep translator with the Google engine. You can find all the details about translation, raw .csv files and other stuff in this Github repository. For more information also check the official original dataset card.… See the full description on the dataset page: https://huggingface.co/datasets/seara/ru_go_emotions.
Comparison of expanded experimental results.
figshare.com
plos.figshare.com
xls
Updated Nov 13, 2025
Cite
Jingyi Zhou; Senlin Luo; Haofan Chen (2025). Comparison of expanded experimental results. [Dataset]. http://doi.org/10.1371/journal.pone.0333930.t005
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0333930.t005
Dataset updated
Nov 13, 2025
Dataset provided by
PLOS ONE
Authors
Jingyi Zhou; Senlin Luo; Haofan Chen
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Text emotion detection constitutes a crucial foundation for advancing artificial intelligence from basic comprehension to the exploration of emotional reasoning. Most existing emotion detection datasets rely on manual annotations, which are associated with high costs, substantial subjectivity, and severe label imbalances. This is particularly evident in the inadequate annotation of micro-emotions and the absence of emotional intensity representation, which fail to capture the rich emotions embedded in sentences and adversely affect the quality of downstream task completion. By proposing an all-labels and training-set label regression method, we map label values to energy intensity levels, thereby fully leveraging the learning capabilities of machine models and the interdependencies among labels to uncover multiple emotions within samples. This led to the establishment of the Emotion Quantization Network (EQN) framework for micro-emotion detection and annotation. Using five commonly employed sentiment datasets, we conducted comparative experiments with various models, validating the broad applicability of our framework within NLP machine learning models. Based on the EQN framework, emotion detection and annotation are conducted on the GoEmotions dataset. A comprehensive comparison with the results from its literature demonstrates that the EQN framework possesses a high capability for automatic detection and annotation of micro-emotions. The EQN framework is the first to achieve automatic micro-emotion annotation with energy-level scores, providing strong support for further emotion detection analysis and the quantitative research of emotion computing.
Go Emotions: Google Emotions Dataset
kaggle.com
zip
Updated Nov 17, 2021
Cite
Shivam Bansal (2021). Go Emotions: Google Emotions Dataset [Dataset]. https://www.kaggle.com/datasets/shivamb/go-emotions-google-emotions-dataset/code
Explore at:
Available download formats: zip (9100876 bytes)
Dataset updated
Nov 17, 2021
Authors
Shivam Bansal
License
CC0 1.0 (https://creativecommons.org/publicdomain/zero/1.0/)
Description
The Google AI GoEmotions dataset consists of comments from Reddit users with labels of their emotional coloring. GoEmotions is designed to train neural networks to perform deep analysis of the tonality of texts. Most of the existing emotion classification datasets cover certain areas (for example, news headlines and movie subtitles), are small in size and use a scale of only six basic emotions (anger, surprise, disgust, joy, fear, and sadness). The expansion of the emotional spectrum considered in datasets could make it possible to create more sensitive chatbots, models for detecting dangerous behavior on the Internet, as well as improve customer support services.

The categories of emotions were identified by Google together with psychologists and include 12 positive, 11 negative, 4 ambiguous emotions, and 1 neutral, which makes the dataset suitable for solving tasks that require subtle differentiation between different emotions.

Source: https://arxiv.org/pdf/2005.00547.pdf Github: https://github.com/google-research/google-research/tree/master/goemotions
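
The 12 / 11 / 4 / 1 breakdown described above can be written down as a lookup table. The grouping below follows the sentiment mapping distributed with the original GoEmotions release as best recalled here, so the exact membership should be treated as an assumption and verified against the linked repository. SENTIMENT_GROUPS and sentiment_of are illustrative names.

```python
# Assumed sentiment grouping of the 27 emotions (verify against the repository).
SENTIMENT_GROUPS = {
    "positive": [
        "admiration", "amusement", "approval", "caring", "desire",
        "excitement", "gratitude", "joy", "love", "optimism", "pride",
        "relief",
    ],
    "negative": [
        "anger", "annoyance", "disappointment", "disapproval", "disgust",
        "embarrassment", "fear", "grief", "nervousness", "remorse",
        "sadness",
    ],
    "ambiguous": ["confusion", "curiosity", "realization", "surprise"],
}

# Invert the table so each emotion points to its group.
EMOTION_TO_SENTIMENT = {
    emotion: group
    for group, emotions in SENTIMENT_GROUPS.items()
    for emotion in emotions
}


def sentiment_of(emotion):
    """Return the coarse sentiment group; 'neutral' stays its own class."""
    return EMOTION_TO_SENTIMENT.get(emotion, "neutral")


counts = {group: len(names) for group, names in SENTIMENT_GROUPS.items()}
print(counts)                     # {'positive': 12, 'negative': 11, 'ambiguous': 4}
print(sentiment_of("gratitude"))  # positive
```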
en_go_emotions
huggingface.co
Updated Dec 15, 2015
Cite
Daniels Buls (2015). en_go_emotions [Dataset]. https://huggingface.co/datasets/SkyWater21/en_go_emotions
Explore at:
Dataset updated
Dec 15, 2015
Authors
Daniels Buls
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Original dataset: GoEmotions dataset
Added labels_ekman column with multi-label emotion annotations mapped to 7 base emotions as per Dr. Ekman theory.
Column labels contains multi-label emotion annotations with 28 emotion labels as per GoEmotion dataset:
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19:… See the full description on the dataset page: https://huggingface.co/datasets/SkyWater21/en_go_emotions.
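
A column such as labels_ekman can be derived from fine-grained labels with a many-to-one lookup. The sketch below uses the Ekman grouping published alongside the original GoEmotions release as best recalled here. Whether this particular dataset used exactly that grouping is not stated in the excerpt, so the table is an assumption, and EKMAN_GROUPS and to_ekman are illustrative names.

```python
# Assumed Ekman grouping of the 27 emotions; neutral is kept as a 7th class.
EKMAN_GROUPS = {
    "anger": ["anger", "annoyance", "disapproval"],
    "disgust": ["disgust"],
    "fear": ["fear", "nervousness"],
    "joy": [
        "joy", "amusement", "approval", "excitement", "gratitude", "love",
        "optimism", "relief", "pride", "admiration", "desire", "caring",
    ],
    "sadness": ["sadness", "disappointment", "embarrassment", "grief", "remorse"],
    "surprise": ["surprise", "realization", "confusion", "curiosity"],
}

# Invert the table: fine-grained emotion -> Ekman class.
FINE_TO_EKMAN = {
    fine: coarse for coarse, fines in EKMAN_GROUPS.items() for fine in fines
}
FINE_TO_EKMAN["neutral"] = "neutral"


def to_ekman(emotions):
    """Map fine-grained emotion names to a sorted list of unique Ekman classes."""
    return sorted({FINE_TO_EKMAN[name] for name in emotions})


print(to_ekman(["annoyance", "disapproval", "gratitude"]))  # ['anger', 'joy']
print(to_ekman(["neutral"]))                                # ['neutral']
```

Because several fine-grained labels collapse into one class, a text with two labels may end up with only one after mapping, which is why the result is de-duplicated.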
Sentiment Analysis Dataset
kaggle.com
zip
Updated May 20, 2025
Cite
Mitesh (2025). Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/mgmitesh/sentiment-analysis-dataset/discussion
Explore at:
Available download formats: zip (16442713 bytes)
Dataset updated
May 20, 2025
Authors
Mitesh
License
Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
License information was derived automatically
Description
This dataset is designed for building and evaluating sentiment and emotion classification models in Natural Language Processing (NLP). It includes two well-known datasets:

GoEmotions: A fine-grained emotion dataset developed by Google, containing 58k English Reddit comments labeled with 27 emotion categories plus Neutral.

DailyDialog: A high-quality multi-turn dialog dataset with emotion and intent annotations, ideal for dialog modeling and conversational AI.

Each dataset is provided in CSV format and includes text samples along with corresponding emotion or sentiment labels.

This dataset is useful for:

Emotion classification and multi-label sentiment analysis.

Fine-tuning transformer models (e.g., BERT, RoBERTa).

Training empathetic conversational agents.

Research in affective computing and human-centered AI.
Examples of full Label method and labels represented by integers or one-hot...
figshare.com
xls
Updated Nov 13, 2025
Cite
Jingyi Zhou; Senlin Luo; Haofan Chen (2025). Examples of full Label method and labels represented by integers or one-hot encoding. [Dataset]. http://doi.org/10.1371/journal.pone.0333930.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0333930.t001
Dataset updated
Nov 13, 2025
Dataset provided by
PLOS ONE
Authors
Jingyi Zhou; Senlin Luo; Haofan Chen
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Examples of full Label method and labels represented by integers or one-hot encoding.
GoEmotions-neutral-sadness-joy-anger
kaggle.com
zip
Updated May 27, 2024
Cite
Taher Hasan (2024). GoEmotions-neutral-sadness-joy-anger [Dataset]. https://www.kaggle.com/datasets/taherhasan/goemotions-neutral-sadness-joy-anger/discussion?sort=undefined
Explore at:
Available download formats: zip (715940 bytes)
Dataset updated
May 27, 2024
Authors
Taher Hasan
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Taher Hasan

Released under Apache 2.0

goemotions-ekman
huggingface.co
Updated May 1, 2024
Cite
Jonas Bacci (2024). goemotions-ekman [Dataset]. https://huggingface.co/datasets/jonasbacci/goemotions-ekman
Explore at:
Dataset updated
May 1, 2024
Authors
Jonas Bacci
Description
jonasbacci/goemotions-ekman dataset hosted on Hugging Face and contributed by the HF Datasets community
Data and some code used in the paper:Expansion quantization network: A...
figshare.com
zip
Updated Oct 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zhou (2025). Data and some code used in the paper:Expansion quantization network: A micro-emotion detection and annotation framework [Dataset]. http://doi.org/10.6084/m9.figshare.30406315.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.30406315.v1
Dataset updated
Oct 21, 2025
Dataset provided by
figshare
Authors
Zhou
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The EQN framework is a micro-emotion annotation and detection system that realizes the automatic micro-emotion annotation of text with energy level scores for the first time. The text emotion datasets it annotates are no longer simple single-label or multi-label, but macro-emotions and micro-emotions with continuous values of emotion intensity. The labeling of emotion datasets has changed from discrete to continuous. It plays an important role in the subtle research of emotions in fields such as emotional computing, human-computer alignment, humanoid robots, and psychology.

This is the experimental result of the EQN micro-emotion detection and annotation framework we proposed: the train.csv of the Goemotions dataset with micro-emotion labels with energy level intensity values, and the model trained on the Goemotions dataset based on the BERT model. Attached is the micro-emotion annotation code based on pytorch, which can be used to annotate the Goemotions dataset by yourself, or predict the emotion classification based on the annotation results. For the specific implementation method, please refer to our paper.

Note:
1. gotrainadd.csv: Goemotions dataset with additional annotation (micro-emotion labels with energy level intensity values (0-10)).
2. 28pd.py: Micro-emotion detection and annotation code based on pytorch.
3. 55770-1.pth: Model trained on the Goemotions dataset based on the BERT model (emotion energy level intensity is a value between 0-1).
4. Goemotions dataset: Data and code available at https://github.com/google-research/google-research/tree/master/goemotions

The experimental environment of this project:
GPU: NVIDIA GeForce RTX 3090 GPU
Bert-base-cased pre-trained model: https://huggingface.co/google-bert/bert-base-cased
python=3.7, pytorch=1.9.0, cudatoolkit=11.3.1, cudnn=8.9.7.29

Instructions for use:
1. Refer to our usage environment instructions and install the operating environment.
2. Download our EQN-model.
3. Change the loading model name in 28pd.py to the actual name of the downloaded EQN-model.
4. Create a directory named "28pd" to place the .csv format data files to be labeled or predicted.
goemotions-binary
huggingface.co
Updated Dec 15, 2015
Cite
Alisha Walunj (2015). goemotions-binary [Dataset]. https://huggingface.co/datasets/alisha4walunj/goemotions-binary
Explore at:
Dataset updated
Dec 15, 2015
Authors
Alisha Walunj
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
alisha4walunj/goemotions-binary dataset hosted on Hugging Face and contributed by the HF Datasets community
