5 datasets found
  1. Twitter Emoji Prediction

    • kaggle.com
    Updated Feb 10, 2019
    Cite
    HariAS (2019). Twitter Emoji Prediction [Dataset]. https://www.kaggle.com/hariharasudhanas/twitter-emoji-prediction/code
    Dataset updated
    Feb 10, 2019
    Dataset provided by
    Kaggle
    Authors
    HariAS
    Description

    Content

    Train.csv contains tweets, and the labels are emojis. The emoji-label mapping is in Mapping.csv. The task is to predict which emojis to use for the test set.
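
As a minimal sketch of how the two files might be combined, the snippet below joins each tweet's numeric label with its emoji. The column names (`TEXT`, `Label`, `number`, `emoticons`) and the sample rows are assumptions for illustration, not taken from the description; check the real CSV headers first.

```python
import csv
import io

# Hypothetical column names; verify against the real file headers.
TEXT_COL, LABEL_COL = "TEXT", "Label"
MAP_ID_COL, MAP_EMOJI_COL = "number", "emoticons"


def load_mapping(mapping_file):
    """Return {label_id: emoji} from an open Mapping.csv-like file."""
    reader = csv.DictReader(mapping_file)
    return {int(row[MAP_ID_COL]): row[MAP_EMOJI_COL] for row in reader}


def load_examples(train_file, mapping):
    """Return (tweet, emoji) pairs from an open Train.csv-like file."""
    reader = csv.DictReader(train_file)
    return [(row[TEXT_COL], mapping[int(row[LABEL_COL])]) for row in reader]


# Tiny in-memory stand-ins for the real files (made-up rows).
sample_mapping = io.StringIO("number,emoticons\n0,😂\n1,❤\n")
sample_train = io.StringIO("TEXT,Label\nso funny,0\nlove this,1\n")

mapping = load_mapping(sample_mapping)
examples = load_examples(sample_train, mapping)
print(examples)
```

With the real files, the same functions could be called on handles opened with `open("Train.csv", newline="", encoding="utf-8")`.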

    Approaches

    The best method among those tried was a bidirectional LSTM with GloVe embeddings (42B).

    License

    Belongs to the original author on Twitter

  2. Italian Tweet Embeddings Used For Emoji Prediction

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Andrei Catalin Coman (2020). Italian Tweet Embeddings Used For Emoji Prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1467219
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Yaroslav Nechaev
    Giacomo Zara
    Andrei Catalin Coman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 100-dimensional word embeddings trained on 48M Italian tweets using fastText. Our team used them to predict emojis in the ITAmoji competition of the EVALITA 2018 Evaluation Campaign.
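
One common way to use word embeddings like these is to average the vectors of a tweet's tokens into a single feature vector. The sketch below illustrates that idea only; it is not the team's method. It assumes the vectors are stored in the plain-text `.vec` layout (a `count dim` header line, then one `word v1 ... vdim` line per word), which the description does not confirm, and it uses made-up 3-dimensional vectors instead of the real 100-dimensional ones.

```python
import io


def load_vec(handle):
    """Parse text-format word vectors: 'count dim' header, then 'word v1 ... vdim'."""
    _count, dim = map(int, handle.readline().split())
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def tweet_vector(tokens, vectors, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


# Made-up 3-dimensional sample standing in for the real embedding file.
sample = io.StringIO("2 3\nciao 1.0 0.0 2.0\nmondo 3.0 2.0 0.0\n")
dim, vectors = load_vec(sample)
avg = tweet_vector(["ciao", "mondo", "xyz"], vectors, dim)
print(avg)
```

Unknown tokens such as `xyz` are simply ignored here; a real pipeline might instead back off to subword information.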

  3. Data from: Multimodal Emoji Prediction

    • opendatalab.com
    Updated Sep 22, 2022
    Cite
    IBM T. J. Watson Research Center USA (2022). Multimodal Emoji Prediction [Dataset]. https://opendatalab.com/OpenDataLab/Multimodal_Emoji_Prediction
    Available download formats: zip (24140764 bytes)
    Dataset updated
    Sep 22, 2022
    Dataset provided by
    IBM T. J. Watson Research Center USA
    Universitat Pompeu Fabra
    TALN
    Description

    The Twitter emoji dataset, obtained from CodaLab, comprises 50 thousand tweets along with their associated emoji labels. Each tweet has a numerical label that maps to a specific emoji. Only the 20 most frequent emojis are covered, so the labels range from 0 to 19.
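
Since the labels are integers from 0 to 19, a quick sanity check is to validate the range and look at the class distribution before training anything. The following is a minimal sketch; the function name and the sample labels are hypothetical.

```python
from collections import Counter

NUM_CLASSES = 20  # from the description: labels range from 0 to 19


def label_distribution(labels):
    """Return the relative frequency of each class as a list of length NUM_CLASSES."""
    if not labels:
        raise ValueError("no labels given")
    for y in labels:
        if not 0 <= y < NUM_CLASSES:
            raise ValueError(f"label out of range: {y}")
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(NUM_CLASSES)]


# Made-up labels for illustration only.
labels = [0, 0, 1, 19]
dist = label_distribution(labels)

# The most frequent class gives a trivial majority baseline to compare against.
majority = max(range(NUM_CLASSES), key=lambda k: dist[k])
print(majority, dist[majority])
```

Because the classes are the most frequent emojis, the distribution is likely skewed, so comparing any model against the majority baseline is a reasonable first step.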

  4. Data from: Twitter corpus of Resource-Scarce Languages for Sentiment...

    • figshare.com
    Updated Jun 12, 2018
    Cite
    Rajat Singh; Nurendra Choudhary (2018). Twitter corpus of Resource-Scarce Languages for Sentiment Analysis and Multilingual Emoji Prediction [Dataset]. http://doi.org/10.6084/m9.figshare.6477782.v6
    Available download formats: zip
    Dataset updated
    Jun 12, 2018
    Dataset provided by
    figshare
    Authors
    Rajat Singh; Nurendra Choudhary
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was created by leveraging social media platforms such as Twitter to develop a corpus across multiple languages. The corpus creation methodology is applicable to resource-scarce languages, provided the speakers of that language are active users of social media platforms. We present an approach to extracting social media microblogs such as tweets (Twitter). We created a corpus for multilingual sentiment analysis and emoji prediction in Hindi, Bengali and Telugu. Further, we perform and analyze multiple NLP tasks using the corpus and report our observations.

  5. Data_Sheet_1_COVID-19 case prediction using emotion trends via Twitter emoji...

    • frontiersin.figshare.com
    Updated May 31, 2023
    Cite
    Vu Tran; Tomoko Matsui (2023). Data_Sheet_1_COVID-19 case prediction using emotion trends via Twitter emoji analysis: A case study in Japan.pdf [Dataset]. http://doi.org/10.3389/fpubh.2023.1079315.s001
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Vu Tran; Tomoko Matsui
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Japan
    Description

    Introduction: The worldwide COVID-19 pandemic, which began in December 2019 and has lasted for almost 3 years now, has undergone many changes and has changed public perceptions and attitudes. Various systems for predicting the progression of the pandemic have been developed to help assess the risk of COVID-19 spreading. In a case study in Japan, we attempt to determine whether the trend of emotions toward COVID-19 expressed on social media, specifically Twitter, can be used to enhance COVID-19 case prediction system performance.

    Methods: We use emoji as a proxy to shallowly capture the trend in emotion expression on Twitter. Two aspects of emoji are studied: the surface trend in emoji usage by using the tweet count and the structural interaction of emoji by using an anomalous score.

    Results: Our experimental results show that utilizing emoji improved system performance in the majority of evaluations.

Twitter Emoji Prediction

Predict relevant emoji to use given the tweet

13 scholarly articles cite this dataset (View in Google Scholar)