100+ datasets found
  1. emotion

    • huggingface.co
    Updated Feb 16, 2023
    + more versions
    Cite
    DAIR.AI (2023). emotion [Dataset]. https://huggingface.co/datasets/dair-ai/emotion
    Dataset updated
    Feb 16, 2023
    Dataset provided by
    DAIR.AI
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for "emotion"

      Dataset Summary
    

    Emotion is a dataset of English Twitter messages with six basic emotions: anger, fear, joy, love, sadness, and surprise. For more detailed information please refer to the paper.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    More Information Needed

      Dataset Structure

      Data Instances

    An example looks as follows. { "text": "im feeling quite sad and sorry for myself but… See the full description on the dataset page: https://huggingface.co/datasets/dair-ai/emotion.

  2. SMILE Twitter Emotion dataset

    • figshare.com
    txt
    Updated Apr 21, 2016
    Cite
    Bo Wang; Adam Tsakalidis; Maria Liakata; Arkaitz Zubiaga; Rob Procter; Eric Jensen (2016). SMILE Twitter Emotion dataset [Dataset]. http://doi.org/10.6084/m9.figshare.3187909.v2
    Available download formats: txt
    Dataset updated
    Apr 21, 2016
    Dataset provided by
    figshare
    Authors
    Bo Wang; Adam Tsakalidis; Maria Liakata; Arkaitz Zubiaga; Rob Procter; Eric Jensen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was collected and annotated for the SMILE project (http://www.culturesmile.org). The collection of tweets mentioning 13 Twitter handles associated with British museums was gathered between May 2013 and June 2015. It was created to classify emotions expressed on Twitter towards arts and cultural experiences in museums. It contains 3,085 tweets annotated with 5 emotions: anger, disgust, happiness, surprise and sadness. Please see our paper "SMILE: Twitter Emotion Classification using Domain Adaptation" for more details of the dataset.

    License: The annotations are provided under a CC-BY license, while Twitter retains the ownership and rights of the content of the tweets.

  3. Twitter Tweets Sentiment Dataset

    • opendatabay.com
    • kaggle.com
    .csv
    Updated Jun 8, 2025
    Cite
    Datasimple (2025). Twitter Tweets Sentiment Dataset [Dataset]. https://www.opendatabay.com/data/dataset/89d10076-3c7d-4857-8c75-0b284a9a7f06
    Available download formats: .csv
    Dataset updated
    Jun 8, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    Twitter is an online social media platform where people share their thoughts as tweets. It has been observed that some people misuse it to tweet hateful content. Twitter is trying to tackle this problem, and we shall help by creating a strong NLP-based classifier model to distinguish negative tweets and block them. Can you build a strong classifier model to predict the same?

    Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

    Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.
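
    The quote-stripping advice above can be sketched as follows. This is a minimal illustration, not the competition's official loader: the sample rows and the helper names strip_outer_quotes and load_rows are hypothetical, and only the column names (textID, text, sentiment) come from the description. Whitespace inside the field is left untouched, since spans are meant to keep every character.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the description.
SAMPLE = "\n".join([
    "textID,text,sentiment",
    'a1,"""I love this museum""",positive',
    "a2,' so sad today ',negative",
])


def strip_outer_quotes(value: str) -> str:
    """Remove one pair of matching quotes wrapping the field, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
        return value[1:-1]
    return value


def load_rows(handle):
    """Read the CSV and clean only the text column, leaving the rest as-is."""
    reader = csv.DictReader(handle)
    return [{**row, "text": strip_outer_quotes(row["text"])} for row in reader]


rows = load_rows(io.StringIO(SAMPLE))
for row in rows:
    print(repr(row["text"]), row["sentiment"])
```

    The csv module already removes the quoting that belongs to the CSV format itself, so the helper only targets quote characters that survive as part of the field value.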

    You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

    Columns:

    • textID - unique ID for each piece of text
    • text - the text of the tweet
    • sentiment - the general sentiment of the tweet

    Acknowledgement: The dataset was downloaded from Kaggle Competitions:
    https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

    Objective: Understand the dataset and clean it up (if required). Build classification models to predict Twitter sentiments. Compare the evaluation metrics of various classification algorithms.

    Original Data Source: Twitter Tweets Sentiment Dataset

  4. Data from: COVID-19 Twitter Dataset with Latent Topics, Sentiments and...

    • catalog.midasnetwork.us
    csv, zip
    Updated Jul 12, 2023
    + more versions
    Cite
    MIDAS Coordination Center (2023). COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes [Dataset]. http://doi.org/10.3886/E120321
    Available download formats: csv, zip
    Dataset updated
    Jul 12, 2023
    Dataset authored and provided by
    MIDAS Coordination Center
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Variables measured
    media, disease, COVID-19, pathogen, Homo sapiens, social media, host organism, infectious disease, Severe acute respiratory syndrome coronavirus 2
    Dataset funded by
    National Institute of General Medical Sciences
    Description

    The dataset covers public conversation on Twitter surrounding the COVID-19 pandemic. The dataset creators annotated seventeen latent semantic attributes for each public tweet using natural language processing techniques and machine-learning based algorithms. The latent semantic attributes include: 1) ten attributes indicating the tweet’s relevance to ten detected topics, 2) five quantitative attributes indicating the degree of intensity in the valence (i.e., unpleasantness/pleasantness) and emotional intensities across four primary emotions of fear, anger, sadness and joy, and 3) two qualitative attributes indicating the sentiment category and the most dominant emotion category, respectively. Data is accessible to people who have an OPEN ICPSR account.
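
    As a rough sketch of how the seventeen attributes break down (ten topic-relevance values, one valence score plus four emotion intensities, and two categorical labels), a record could be modelled as below. The class name, field names and example values are assumptions made for illustration; the actual column names in the released files may differ.

```python
from dataclasses import dataclass
from typing import Tuple

# The four primary emotions named in the description.
EMOTIONS = ("fear", "anger", "sadness", "joy")


@dataclass(frozen=True)
class TweetAttributes:
    """Hypothetical container for the seventeen per-tweet attributes."""

    topic_relevance: Tuple[float, ...]    # 1) ten topic-relevance attributes
    valence: float                        # 2) valence intensity ...
    emotion_intensity: Tuple[float, ...]  #    ... plus four emotion intensities
    sentiment: str                        # 3) sentiment category
    dominant_emotion: str                 #    and most dominant emotion

    def __post_init__(self):
        if len(self.topic_relevance) != 10:
            raise ValueError("expected ten topic-relevance values")
        if len(self.emotion_intensity) != len(EMOTIONS):
            raise ValueError("expected one intensity per primary emotion")

    def attribute_count(self) -> int:
        """Count attributes: topics + valence + emotion intensities + 2 labels."""
        return len(self.topic_relevance) + 1 + len(self.emotion_intensity) + 2


# Made-up values, purely to show the shape of one record.
example = TweetAttributes(
    topic_relevance=(0.0,) * 9 + (1.0,),
    valence=0.35,
    emotion_intensity=(0.6, 0.2, 0.4, 0.1),
    sentiment="negative",
    dominant_emotion="fear",
)
print(example.attribute_count())
```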

  5. RANLP-Emotions-Twitter

    • data.niaid.nih.gov
    Updated Oct 14, 2021
    Cite
    Sanja Stajner (2021). RANLP-Emotions-Twitter [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5565044
    Dataset updated
    Oct 14, 2021
    Dataset authored and provided by
    Sanja Stajner
    Description

    The RANLP-Emotions-Twitter dataset contains 210 English tweets annotated by six trained annotators for Ekman's basic emotions plus the neutral class.

    The details of the annotation procedure and various analyses can be found in [1].

    The dataset can be used only for non-commercial research purposes.

    If you use this dataset, please reference the following paper:

    [1] Štajner, S. 2021. Exploring Reliability of Gold Labels for Emotion Detection in Twitter. In Proceedings of the 13th international conference on Recent Advances in Natural Language Processing (RANLP), pp. 1350-1359.

    Bibtex reference:

    @inproceedings{stajner-2021-ranlp-emotions,
    title = "Exploring Reliability of Gold Labels for Emotion Detection in Twitter",
    author = "\v{S}tajner, Sanja",
    booktitle = "Proceedings of the 13th international conference on Recent Advances in Natural Language Processing (RANLP)",
    month = sep,
    year = "2021",
    address = "Online",
    pages = "1350--1359",
    abstract = "Emotion detection from social media posts has attracted noticeable attention from natural language processing (NLP) community in recent years. The ways for obtaining gold labels for training and testing of the systems for automatic emotion detection differ significantly from one study to another, and pose the question of reliability of gold labels and obtained classification results. This study systematically explores several ways for obtaining gold labels for Ekman's emotion model on Twitter data and the influence of the chosen strategy on the manual classification results.",
    }

  6. Sentiment Analysis Dataset

    • cubig.ai
    Updated May 20, 2025
    Cite
    CUBIG (2025). Sentiment Analysis Dataset [Dataset]. https://cubig.ai/store/products/270/sentiment-analysis-dataset
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction

    • The Sentiment Analysis Dataset is a dataset for sentiment analysis. It includes large-scale tweet text collected from Twitter and a sentiment polarity label (0=negative, 2=neutral, 4=positive) for each tweet, assigned automatically based on emoticons.

    2) Data Utilization

    (1) Characteristics of the Sentiment Analysis Dataset:

    • Each sample consists of six columns: sentiment polarity, tweet ID, date of writing, search word, author, and tweet body. It is suitable for training natural language processing and classification models using tweet text and sentiment labels.

    (2) The Sentiment Analysis Dataset can be used for:

    • Sentiment classification model development: Using tweet text and polarity labels, positive, negative, and neutral classifiers can be built with various machine learning and deep learning models such as logistic regression, SVM, RNN, and LSTM.
    • Analysis of social media public opinion and trends: By analyzing the distribution of sentiment over time and by keyword, you can explore changes in public opinion on specific issues or brands, positive and negative trends, and key sentiment keywords.
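
    A minimal sketch of reading such rows and decoding the polarity codes follows. It assumes a headerless CSV whose six columns appear in the order listed above; the sample rows, the Tweet field names and the parse_rows helper are hypothetical.

```python
import csv
import io
from typing import NamedTuple

# Polarity codes as given in the description.
POLARITY_NAMES = {0: "negative", 2: "neutral", 4: "positive"}


class Tweet(NamedTuple):
    """One row; field names are illustrative, the order follows the description."""

    polarity: int
    tweet_id: str
    date: str
    query: str
    author: str
    text: str


def parse_rows(handle):
    """Parse headerless six-column rows, skipping any malformed line."""
    tweets = []
    for row in csv.reader(handle):
        if len(row) != 6:
            continue
        tweets.append(Tweet(int(row[0]), *row[1:]))
    return tweets


# Made-up rows for demonstration only.
SAMPLE = "\n".join([
    '"4","1001","Mon May 20 2025","NO_QUERY","user_a","great day at the park"',
    '"0","1002","Mon May 20 2025","NO_QUERY","user_b","missed my train again"',
])

tweets = parse_rows(io.StringIO(SAMPLE))
labels = [POLARITY_NAMES[t.polarity] for t in tweets]
print(labels)
```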

  7. Twitter Sentiment Analysis Dataset

    • paperswithcode.com
    Cite
    Twitter Sentiment Analysis Dataset [Dataset]. https://paperswithcode.com/dataset/twitter-sentiment-analysis
    Description

    This is an entity-level Twitter Sentiment Analysis dataset. For each message, the task is to judge the sentiment of the entire sentence towards a given entity. For example, "A outperforms B" is positive for entity A but negative for entity B. The dataset contains ~70K labeled training messages and 1K labeled validation messages. It is available online for free on Kaggle.

  8. CaSET-catalan-stance-emotions-twitter

    • zenodo.org
    • huggingface.co
    zip
    Updated Jun 20, 2024
    Cite
    Zenodo (2024). CaSET-catalan-stance-emotions-twitter [Dataset]. http://doi.org/10.57967/hf/2596
    Available download formats: zip
    Dataset updated
    Jun 20, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Summary

    The CaSET dataset is a Catalan corpus of Tweets annotated with Emotions, Static Stance, and Dynamic Stance. The dataset contains 11k unique sentences on five controversial topics, grouped in 6k pairs of sentences, paired as parent messages and replies to these messages.

    We provide the following files and folders:

    • The dataset folder contains the final dataset (an aggregation of the dynamic stance, static stance, and emotion identification tasks), as well as the dataloader and README file as provided in Hugging Face.
    • The annotations folders with one file for each of the three annotation tasks.
    • The annotation guidelines folders with a file for each task.

    Supported Tasks and Leaderboards

    This dataset can be used to train models for emotion detection, static stance detection, and dynamic stance detection.

    Languages

    The dataset is in Catalan (ca-ES).

    Dataset Structure

    Each instance in the dataset is a pair of parent-reply messages, annotated with the relation between the two messages (the dynamic stance) and the topic of the messages. For each message there is the id to retrieve it with the Twitter API, the emotions identified in the message, and the relation between the message and the topic (static stance). The text fields have to be retrieved using the Twitter API.

    Data Instances

    {
    "id_parent": "1413960970066710533",
    "id_reply": "1413968453690658816",
    "parent_text": "",
    "reply_text": "",
    "topic": "vaccines",
    "dynamic_stance": "Disagree",
    "parent_stance": "FAVOUR",
    "reply_stance": "AGAINST",
    "parent_emotion": ["distrust", "joy", "disgust"],
    "reply_emotion": ["distrust"]
    }
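
    Because the text fields ship empty, a consumer first has to work out which tweet IDs still need their text fetched. The sketch below parses the instance shown above and lists those IDs; the helper name ids_missing_text is hypothetical, and the actual retrieval through the Twitter API is deliberately left out.

```python
import json

# The data instance shown above, reproduced as a JSON string.
INSTANCE = """
{
  "id_parent": "1413960970066710533",
  "id_reply": "1413968453690658816",
  "parent_text": "",
  "reply_text": "",
  "topic": "vaccines",
  "dynamic_stance": "Disagree",
  "parent_stance": "FAVOUR",
  "reply_stance": "AGAINST",
  "parent_emotion": ["distrust", "joy", "disgust"],
  "reply_emotion": ["distrust"]
}
"""


def ids_missing_text(record):
    """Return the tweet IDs whose text field is still empty."""
    pairs = [("id_parent", "parent_text"), ("id_reply", "reply_text")]
    return [record[id_key] for id_key, text_key in pairs if not record[text_key]]


record = json.loads(INSTANCE)
print(ids_missing_text(record))
```

    Keeping the IDs as strings avoids any precision loss that could occur if such large identifiers were converted to floating-point numbers downstream.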

    Data Splits

    The dataset does not contain splits.

    Dataset Creation

    We created this corpus to contribute to the development of language models in Catalan, a low-resource language.

    Source Data

    The data was collected using the Twitter API by the Barcelona Supercomputing Center.

    Initial Data Collection and Normalization

    The data was collected based on a list of keywords related to the five topics included in the dataset: vaccines, rent regulation, surrogate pregnancy, airport expansion, and a TV show rigging. Specific periods in which the topic was under discussion were also selected.

    Who are the source language producers?

    The source language producers are users of Twitter.

    Annotations

    • Emotions are annotated in a multi-label fashion. The labels can be: Anger, Anticipation, Disgust, Fear, Joy, Sadness, Surprise, Distrust, and No emotion.
    • Static stance is annotated per message. The labels can be: FAVOUR, AGAINST, NEUTRAL, NA.
    • Dynamic stance is annotated per pair. The labels can be: Agree, Disagree, Elaborate, Query, Neutral, Unrelated, NA.
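
    Since emotions are multi-label, a common way to feed them to a classifier is a multi-hot vector with one slot per label. The sketch below is one possible encoding, not the dataset's own loader: the label order and the multi_hot helper are assumptions, and labels are lowercased to match the spelling used in the data instance.

```python
# Emotion labels from the annotation scheme; this ordering is an assumption.
EMOTIONS = [
    "anger", "anticipation", "disgust", "fear", "joy",
    "sadness", "surprise", "distrust", "no emotion",
]
INDEX = {name: position for position, name in enumerate(EMOTIONS)}


def multi_hot(labels):
    """Encode a list of emotion labels as a 0/1 vector over EMOTIONS."""
    vector = [0] * len(EMOTIONS)
    for label in labels:
        vector[INDEX[label.lower()]] = 1  # raises KeyError on unknown labels
    return vector


parent_vector = multi_hot(["distrust", "joy", "disgust"])
print(parent_vector)
```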

    Annotation process

    • For emotions there were 3 annotators. The gold labels are an aggregation of all the labels annotated by the 3. The IAA calculated with Fleiss' Kappa per label was, on average, 45.38.
    • For static stance there were 2 annotators; in cases of disagreement, a third annotator chose the gold label. The overall Fleiss' Kappa between the 2 annotators is 82.71.
    • For dynamic stance there were 4 annotators. If at least 3 of the annotators disagreed, a fifth annotator chose the gold label. The overall Fleiss' Kappa between the 4 annotators was 56.51, and the average Fleiss' Kappa of the annotators with the gold labels is 85.17.
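
    For readers unfamiliar with the agreement measure, Fleiss' Kappa can be computed from a table of per-item category counts as sketched below. The toy ratings are invented and do not reproduce the figures above; the reported values appear to be on a 0-100 scale, which is an assumption here, whereas the function returns the usual value with a maximum of 1.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters who put item i in category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    n_categories = len(counts[0])
    total = n_items * n_raters
    # Share of all ratings that fell into each category.
    p_category = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        return float("nan")  # undefined when all ratings share one category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example: 4 messages, 2 annotators, categories [FAVOUR, AGAINST].
toy = [[2, 0], [0, 2], [2, 0], [1, 1]]
kappa = fleiss_kappa(toy)
print(round(100 * kappa, 2))
```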

    Who are the annotators?

    All the annotators are native speakers of Catalan.

    Social Impact of Dataset

    We hope this corpus contributes to the development of language models in Catalan, a low-resource language.

    Discussion of Biases

    We are aware that, since the data comes from social media, this will contain biases, hate speech and toxic content. We have not applied any steps to reduce their impact.

    Other Known Limitations

    The dataset has to be downloaded using the Twitter API, therefore some instances might be lost.

    Dataset Curators

    Language Technologies Unit (LangTech) at the Barcelona Supercomputing Center.

    This work has been promoted and financed by the Generalitat de Catalunya through the Aina project.

    Licensing Information

    Creative Commons Attribution 4.0.

    Citation Information

    @inproceedings{figueras-etal-2023-dynamic,
    title = "Dynamic Stance: Modeling Discussions by Labeling the Interactions",
    author = "Figueras, Blanca and
    Baucells, Irene and
    Caselli, Tommaso",
    editor = "Bouamor, Houda and
    Pino, Juan and
    Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.432",
    doi = "10.18653/v1/2023.findings-emnlp.432",
    pages = "6503--6515",
    }

    @inproceedings{gonzalez-agirre-etal-2024-building-data,
    title = "Building a Data Infrastructure for a Mid-Resource Language: The Case of {C}atalan",
    author = "Gonzalez-Agirre, Aitor and
    Marimon, Montserrat and
    Rodriguez-Penagos, Carlos and
    Aula-Blasco, Javier and
    Baucells, Irene and
    Armentano-Oller, Carme and
    Palomar-Giner, Jorge and
    Kulebi, Baybars and
    Villegas, Marta",
    editor = "Calzolari, Nicoletta and
    Kan, Min-Yen and
    Hoste, Veronique and
    Lenci, Alessandro and
    Sakti, Sakriani and
    Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.231",
    pages = "2556--2566",
    }

    Contact information

    For further information, please send an email to langtech@bsc.es.

  9. Twitter Reviews for Emotion Analysis

    • kaggle.com
    zip
    Updated Apr 16, 2020
    Cite
    Shainy Merin S (2020). Twitter Reviews for Emotion Analysis [Dataset]. https://www.kaggle.com/datasets/shainy/twitter-reviews-for-emotion-analysis/metadata
    Available download formats: zip (732190 bytes)
    Dataset updated
    Apr 16, 2020
    Authors
    Shainy Merin S
    License

    https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets

    Description

    Context

    This dataset consists of a few thousand Twitter user reviews (input text) and emotions (output labels) for learning how to train a model for emotion analysis. It was created with the Twitter API by searching for keywords. The idea is a dataset that is more than a toy, with real business data on a reasonable scale, but that can still be trained on in minutes on a modest laptop.

    Content

    This file has Sl no, Tweets, Search key, Feeling.

    Description of columns in the file:

    • Tweets - text of the review
    • Search key - keyword used
    • Feeling - emotion classified using the keyword; this column contains 6 emotions, i.e., Happy, Sad, Surprise, Fear, Disgust, Angry

    Inspiration

    This would help organizations understand customer reviews and feedback.

  10. Twitter Tweets Sentiment Dataset

    • cubig.ai
    Updated Feb 25, 2025
    Cite
    CUBIG (2025). Twitter Tweets Sentiment Dataset [Dataset]. https://cubig.ai/store/products/142/twitter-tweets-sentiment-dataset
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data introduction

    • The Twitter-tweets-sentiment dataset is intended for analyzing tweet sentiment with natural language processing.

    2) Data utilization

    (1) Characteristics of the Twitter-tweets-sentiment data:

    • The data consists of three columns, including emotion and text, and aims to block negative tweets through a powerful classification model.

    (2) The Twitter-tweets-sentiment data can be used for:

    • Social media monitoring: Businesses and organizations can use the data to monitor social media platforms and gauge public sentiment about a brand, product, event, or social issue.
    • Sentiment analysis: This dataset can be used to train models that classify the sentiment of tweets, which can help companies and researchers understand public opinion on a variety of topics.

  11. Data from: COVID-19 Twitter Dataset with Latent Topics, Sentiments and...

    • openicpsr.org
    Updated Jul 18, 2020
    Cite
    Yinping Yang (2020). COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes [Dataset]. http://doi.org/10.3886/E120321V1
    Dataset updated
    Jul 18, 2020
    Dataset provided by
    Dr.
    Authors
    Yinping Yang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 28, 2020 - Jul 1, 2020
    Area covered
    Global
    Description

    We collected and processed a dataset and make it available for the research community to study the COVID-19 pandemic in multiple ways.

  12. Twitter Sentiment Analysis Data

    • ieee-dataport.org
    Updated Aug 6, 2024
    Cite
    Rabindra Lamsal (2024). Twitter Sentiment Analysis Data [Dataset]. https://ieee-dataport.org/documents/twitter-sentiment-analysis-data
    Dataset updated
    Aug 6, 2024
    Authors
    Rabindra Lamsal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    because of COVID-19

  13. Data from: Dataset: input and results related to the paper 'Anticipointment...

    • ssh.datastations.nl
    • narcis.nl
    zip
    Updated Jun 29, 2017
    Cite
    F.A. Kunneman; M.J.P. van Mulken; A.P.J. van den Bosch (2017). Dataset: input and results related to the paper 'Anticipointment detection in event tweets' [Dataset]. http://doi.org/10.17026/DANS-XCP-X989
    Available download formats: zip (17693), zip (424769593)
    Dataset updated
    Jun 29, 2017
    Dataset provided by
    DANS Data Station Social Sciences and Humanities
    Authors
    F.A. Kunneman; M.J.P. van Mulken; A.P.J. van den Bosch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset features the training models, emotion classifications and emotion patterns before and after events, related to the paper: F. Kunneman, M. van Mulken and A. Van den Bosch, Anticipointment detection in event tweets (under review).

    Abstract of the study: We developed a system to detect positive expectation, disappointment, and satisfaction in tweets that refer to events automatically discovered in the Twitter stream. The emotional content shared on Twitter when referring to public events can provide insights into the presumed and experienced quality of the event. We expected to find a connection between positive expectation and disappointment, a succession that is sometimes referred to as anticipointment. The application of computational approaches makes it possible to detect the presence and strength of this hypothetical relation for a large number of events. We extracted events from a longitudinal data set of Dutch Twitter posts, and modeled classifiers to recognize emotion in the tweets related to those events by means of hashtag-labeled training data. After classifying all tweets before and after the events in our data set, we summarized the collective emotions by calculating the percentage of tweets classified with an emotion as well as ranking tweets based on the classifier confidence score for an emotion and selecting the 90th percentile. Only a weak correlation of around 0.2 was found between positive expectation and disappointment, while a higher correlation of 0.6 was found between positive expectation and satisfaction. The most anticipointing events were events with a clear loss, such as a canceled event or when the favored sports team had lost. We conclude that senders of Twitter posts might be more inclined to share satisfaction than disappointment after a much anticipated event.

    Subject period: January 1st 2011 until October 31st 2015

    Date: start=2015-11-01; end=2016-02-28 (data collection)
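
    The two per-event summaries mentioned in the abstract (the percentage of tweets classified with an emotion, and the classifier confidence score at the 90th percentile) could be computed roughly as below. This is an illustrative sketch, not the authors' code: the nearest-rank percentile definition, the function names and the sample values are all assumptions.

```python
def emotion_share(labels, emotion):
    """Percentage of tweets whose predicted label equals the given emotion."""
    if not labels:
        raise ValueError("no tweets to summarize")
    return 100.0 * sum(1 for label in labels if label == emotion) / len(labels)


def percentile_score(scores, pct=90):
    """Nearest-rank percentile of classifier confidence scores."""
    if not scores:
        raise ValueError("no scores to summarize")
    ordered = sorted(scores)
    rank = -(-pct * len(ordered) // 100)  # ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]


# Made-up classifier output for the tweets around one event.
labels = ["satisfaction", "disappointment", "satisfaction", "none"]
scores = [0.15, 0.40, 0.55, 0.62, 0.70, 0.78, 0.81, 0.88, 0.93, 0.97]

print(emotion_share(labels, "satisfaction"))
print(percentile_score(scores))
```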

  14. Processed twitter sentiment Dataset | Added Tokens

    • kaggle.com
    Updated Aug 21, 2024
    Cite
    Halemo GPA (2024). Processed twitter sentiment Dataset | Added Tokens [Dataset]. http://doi.org/10.34740/kaggle/ds/5568348
    Dataset updated
    Aug 21, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Halemo GPA
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset is a processed version of the Sentiment140 corpus, containing 1.6 million tweets with binary sentiment labels. The original data has been cleaned, tokenized, and prepared for natural language processing (NLP) and machine learning tasks. It provides a rich resource for sentiment analysis, text classification, and other NLP applications. The dataset includes the full processed corpus (train-processed.csv) and a smaller sample of 10,000 tweets (train-processed-sample.csv) for quick experimentation and model prototyping.

    Key Features:

    • 1.6 million labeled tweets
    • Binary sentiment classification (0 for negative, 1 for positive)
    • Preprocessed and tokenized text
    • Balanced class distribution
    • Suitable for various NLP tasks and model architectures

    Citation

    If you use this dataset in your research or project, please cite the original Sentiment140 dataset: Go, A., Bhayani, R. and Huang, L., 2009. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, 1(2009), p.12.

  15. large-twitter-tweets-sentiment

    • huggingface.co
    Updated Mar 6, 2024
    + more versions
    Cite
    Gong Xiangbo (2024). large-twitter-tweets-sentiment [Dataset]. https://huggingface.co/datasets/gxb912/large-twitter-tweets-sentiment
    Dataset updated
    Mar 6, 2024
    Authors
    Gong Xiangbo
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "Large twitter tweets sentiment analysis"

      Dataset Description

      Dataset Summary

    This dataset is a collection of tweets in a tabular structure, annotated for sentiment analysis. Each tweet is associated with a sentiment label, with 1 indicating a Positive sentiment and 0 a Negative sentiment.

      Languages

    The tweets are in English.

      Dataset Structure

      Data Instances

    An instance of the dataset includes… See the full description on the dataset page: https://huggingface.co/datasets/gxb912/large-twitter-tweets-sentiment.

  16. Twitter Sentiments Dataset

    • data.mendeley.com
    Updated May 14, 2021
    + more versions
    Cite
    SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1
    Dataset updated
    May 14, 2021
    Authors
    SHERIF HUSSEIN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset has three sentiments, namely negative, neutral, and positive. It contains two fields: the tweet and the label.

  17. Twitter dataset

    • figshare.com
    csv
    Updated Feb 11, 2025
    Cite
    Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan (2025). Twitter dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28390334.v2
    Available download formats: csv
    Dataset updated
    Feb 11, 2025
    Dataset provided by
    figshare
    Authors
    Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains tweets labeled for sentiment analysis, categorized into Positive, Negative, and Neutral sentiments. The dataset includes tweet IDs, user metadata, sentiment labels, and tweet text, making it suitable for Natural Language Processing (NLP), machine learning, and AI-based sentiment classification research. Originally sourced from Kaggle, this dataset is curated for improved usability in social media sentiment analysis.

  18. Data from: JTES Dataset

    • paperswithcode.com
    Updated May 17, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). JTES Dataset [Dataset]. https://paperswithcode.com/dataset/jtes
    Dataset updated
    May 17, 2024
    Description

    We designed an emotional speech database that can be used for emotion recognition as well as recognition and synthesis of speech with various emotions. The database was designed by compiling tweets acquired from Twitter and selecting emotion-dependent tweets considering phonetic and prosodic balance. We classified gathered tweets into four emotions: joy, anger, sadness and neutral, and then selected 50 sentences from sentences of each emotion based on the entropy-based algorithm. We compared the selected sentence sets with randomly selected sentence sets from aspects of phonetic and prosodic balance and sentence length, and confirmed that the sets selected by the algorithm were more balanced. Next, we recorded emotional speech based on the selected sentences. Then, we evaluated the speech from the viewpoint of emotional recognition and emotional speech recognition.

  19. twitter-financial-news-sentiment

    • huggingface.co
    • opendatalab.com
    Updated Dec 4, 2022
    + more versions
    Cite
    not a (2022). twitter-financial-news-sentiment [Dataset]. https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
    Dataset updated
    Dec 4, 2022
    Authors
    not a
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Description

    The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.

    The dataset holds 11,932 documents annotated with 3 labels:

    sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }

    The data was collected using the Twitter API. The current dataset supports the multi-class classification… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment.
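As a small illustration of the label mapping quoted above, the sketch below decodes a prediction into its sentiment name. The `decode_label` helper is hypothetical, and it assumes that integer class ids 0, 1, and 2 correspond to `LABEL_0`, `LABEL_1`, and `LABEL_2`; verify this against the dataset page before relying on it.

```python
# Label mapping as given in the dataset description.
SENTIMENTS = {"LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral"}


def decode_label(label):
    """Return the sentiment name for a "LABEL_n" key or an integer id (assumed 0..2)."""
    key = label if isinstance(label, str) else f"LABEL_{label}"
    return SENTIMENTS[key]


print(decode_label(0))          # Bearish
print(decode_label("LABEL_2"))  # Neutral
```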

  20. Z

    Brussel mobility Twitter sentiment analysis CSV Dataset

    • data.niaid.nih.gov
    Updated May 31, 2024
    Cite
    van Vessem, Charlotte (2024). Brussel mobility Twitter sentiment analysis CSV Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11401123
    Dataset updated
    May 31, 2024
    Dataset provided by
    Ginis, Vincent
    Betancur Arenas, Juliana
    Tori, Floriano
    van Vessem, Charlotte
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Brussels
    Description

    SSH CENTRE (Social Sciences and Humanities for Climate, Energy aNd Transport Research Excellence) is a Horizon Europe project, engaging directly with stakeholders across research, policy, and business (including citizens) to strengthen social innovation, SSH-STEM collaboration, transdisciplinary policy advice, inclusive engagement, and SSH communities across Europe, accelerating the EU’s transition to carbon neutrality. SSH CENTRE is based on a range of activities related to Open Science, inclusivity and diversity, especially with regard to Southern and Eastern Europe and different career stages, including: development of novel SSH-STEM collaborations to facilitate the delivery of the EU Green Deal; SSH knowledge brokerage to support regions in transition; and the effective design of strategies for citizen engagement in EU R&I activities. Outputs include action-led agendas and building stakeholder synergies through regular Policy Insight events. This is captured in a high-profile virtual SSH CENTRE generating and sharing best practice for SSH policy advice, overcoming fragmentation to accelerate the EU’s journey to a sustainable future.

    The documents uploaded here are part of WP2, whereby novel, interdisciplinary teams were provided funding to undertake activities to develop a policy recommendation related to EU Green Deal policy. Each of these policy recommendations, and the activities that inform them, will be written up as a chapter in an edited book collection. Three books will make up this edited collection: one on climate, one on energy, and one on mobility.

    As part of writing a chapter for the SSH CENTRE book on ‘Mobility’, we set out to analyse the sentiment of Twitter users regarding shared and active mobility modes in Brussels. This involved collecting tweets between 2017 and 2022. A tweet was collected if it contained a previously defined mobility keyword (for example: metro) and either the name of a (local) politician, a neighbourhood or municipality, or a (shared) mobility provider. The file attached to this Zenodo page is a CSV file containing the collected tweets.
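The collection rule described above (a mobility keyword AND at least one politician, place, or provider name) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: apart from "metro", every term in the lists below is a placeholder assumption, and `tokenize` and `should_collect` are hypothetical helper names.

```python
import re

# Only "metro" comes from the description; the other terms are illustrative placeholders.
MOBILITY_KEYWORDS = {"metro", "tram", "bike"}
# Stand-ins for politician, neighbourhood/municipality, and mobility-provider names.
CONTEXT_TERMS = {"ixelles", "molenbeek", "villo"}


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def should_collect(text, keywords=MOBILITY_KEYWORDS, context=CONTEXT_TERMS):
    """True if the tweet has a mobility keyword and at least one context term."""
    tokens = tokenize(text)
    return bool(tokens & keywords) and bool(tokens & context)


print(should_collect("The metro in Ixelles is late again"))  # True
print(should_collect("The metro is late again"))             # False
```

Matching is done on single word tokens, so multi-word names (for example, a politician's full name) would need phrase matching instead.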

DAIR.AI (2023). emotion [Dataset]. https://huggingface.co/datasets/dair-ai/emotion

20 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Feb 16, 2023
Dataset provided by
DAIR.AI
License

https://choosealicense.com/licenses/other/

Description

Dataset Card for "emotion"

  Dataset Summary

Emotion is a dataset of English Twitter messages with six basic emotions: anger, fear, joy, love, sadness, and surprise. For more detailed information please refer to the paper.

  Supported Tasks and Leaderboards

More Information Needed

  Languages

More Information Needed

  Dataset Structure





  Data Instances

An example looks as follows. { "text": "im feeling quite sad and sorry for myself but… See the full description on the dataset page: https://huggingface.co/datasets/dair-ai/emotion.
