5 datasets found

Twitter Emoji Prediction
kaggle.com
Updated Feb 10, 2019
Cite
HariAS (2019). Twitter Emoji Prediction [Dataset]. https://www.kaggle.com/hariharasudhanas/twitter-emoji-prediction/code
Dataset provided by
Kaggle
Authors
HariAS
Description
Content

Train.csv contains tweets; the labels are emojis. The emoji-to-label mapping is in Mapping.csv. The task is to predict which emojis to use for the test set.
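
As an illustration of how the mapping file relates to the training file, here is a minimal sketch that joins each tweet's numeric label to its emoji. The column names (`text`, `label`, `emoji`) and the sample rows are assumptions made for the example; the description does not state the real headers of Train.csv or Mapping.csv, so check them before adapting this.

```python
import csv
import io

# Hypothetical stand-ins for Train.csv and Mapping.csv. The real column
# names and contents are not given in the description and are assumed here.
TRAIN_CSV = "text,label\nGood morning everyone,1\nHappy birthday!,0\n"
MAPPING_CSV = "label,emoji\n0,\U0001F382\n1,\u2600\n"


def load_mapping(handle):
    """Return a dict from integer label to emoji string."""
    return {int(row["label"]): row["emoji"] for row in csv.DictReader(handle)}


def attach_emojis(train_handle, mapping):
    """Return a list of (tweet_text, emoji) pairs for the training rows."""
    return [
        (row["text"], mapping[int(row["label"])])
        for row in csv.DictReader(train_handle)
    ]


mapping = load_mapping(io.StringIO(MAPPING_CSV))
pairs = attach_emojis(io.StringIO(TRAIN_CSV), mapping)
for text, emoji in pairs:
    print(emoji, text)
```

With the real files, the `io.StringIO` objects would be replaced by `open("Train.csv", newline="", encoding="utf-8")` and the equivalent call for Mapping.csv.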

Approaches

The best method among those tried was a bidirectional LSTM with GloVe embeddings (42B).

License

Belongs to the original author on Twitter
Italian Tweet Embeddings Used For Emoji Prediction
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Andrei Catalin Coman (2020). Italian Tweet Embeddings Used For Emoji Prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1467219
Dataset provided by
Yaroslav Nechaev
Giacomo Zara
Andrei Catalin Coman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains 100-dimensional word embeddings trained on 48M Italian tweets using fastText and employed by our team to predict emojis during the ITAmoji competition of the EVALITA 2018 Evaluation Campaign.
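
For readers who want to use such embeddings, here is a minimal sketch of reading vectors in the fastText text (`.vec`) layout, where the first line holds the vocabulary size and dimension and each later line holds a word followed by its values. Whether this dataset ships in that layout is an assumption, and averaging word vectors into a tweet vector is only a common baseline shown for illustration, not the team's documented method. The sample uses 3 dimensions instead of 100 to stay readable.

```python
import io

# Hypothetical 3-dimensional sample in fastText .vec text layout.
SAMPLE_VEC = "2 3\nciao 1.0 0.0 2.0\nmondo 3.0 2.0 0.0\n"


def load_vec(handle):
    """Parse .vec text: a 'count dim' header, then 'word v1 ... v_dim' lines."""
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed or empty lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return dim, vectors


def average_embedding(tokens, vectors, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


dim, vectors = load_vec(io.StringIO(SAMPLE_VEC))
tweet_vector = average_embedding(["ciao", "mondo", "sconosciuta"], vectors, dim)
print(tweet_vector)
```

Out-of-vocabulary tokens are simply ignored here; a model that uses subword information would handle them differently.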
Data from: Multimodal Emoji Prediction
opendatalab.com
Updated Sep 22, 2022
Cite
IBM T. J. Watson Research Center USA (2022). Multimodal Emoji Prediction [Dataset]. https://opendatalab.com/OpenDataLab/Multimodal_Emoji_Prediction
Available download formats: zip (24140764 bytes)
Dataset provided by
IBM T. J. Watson Research Center USA
Universitat Pompeu Fabra
TALN
Description
The Twitter emoji dataset obtained from CodaLab comprises 50 thousand tweets, each with an associated emoji label. Each tweet in the dataset has a corresponding numerical label that maps to a specific emoji. The emojis are the 20 most frequent ones, so the labels range from 0 to 19.
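
Because the labels are stated to run from 0 to 19, a quick sanity check on loaded labels can catch parsing mistakes early. The sketch below validates the range and computes the class distribution; the function names and the sample labels are hypothetical and not part of the dataset.

```python
from collections import Counter

NUM_CLASSES = 20  # labels 0..19, as stated in the description


def validate_labels(labels, num_classes=NUM_CLASSES):
    """Return labels as ints, raising ValueError on any out-of-range value."""
    checked = []
    for raw in labels:
        value = int(raw)
        if not 0 <= value < num_classes:
            raise ValueError(f"label {value} outside 0..{num_classes - 1}")
        checked.append(value)
    return checked


def class_distribution(labels):
    """Return the fraction of examples per label, ordered by label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}


sample = ["0", "3", "19", "3"]  # hypothetical labels as read from a file
labels = validate_labels(sample)
distribution = class_distribution(labels)
print(distribution)
```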
Data from: Twitter corpus of Resource-Scarce Languages for Sentiment...
figshare.com
Updated Jun 12, 2018
Cite
Rajat Singh; Nurendra Choudhary (2018). Twitter corpus of Resource-Scarce Languages for Sentiment Analysis and Multilingual Emoji Prediction [Dataset]. http://doi.org/10.6084/m9.figshare.6477782.v6
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.6477782.v6
Dataset provided by
figshare
Authors
Rajat Singh; Nurendra Choudhary
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was created by leveraging social media platforms such as Twitter to develop a corpus across multiple languages. The corpus creation methodology is applicable to resource-scarce languages, provided that speakers of the language are active users of social media platforms. We present an approach for extracting social media microblogs such as tweets (Twitter). We created a corpus for multilingual sentiment analysis and emoji prediction in Hindi, Bengali, and Telugu. Further, we perform and analyze multiple NLP tasks using the corpus to obtain interesting observations.
Data_Sheet_1_COVID-19 case prediction using emotion trends via Twitter emoji...
frontiersin.figshare.com
Updated May 31, 2023
Cite
Vu Tran; Tomoko Matsui (2023). Data_Sheet_1_COVID-19 case prediction using emotion trends via Twitter emoji analysis: A case study in Japan.pdf [Dataset]. http://doi.org/10.3389/fpubh.2023.1079315.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fpubh.2023.1079315.s001
Dataset provided by
Frontiers
Authors
Vu Tran; Tomoko Matsui
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Japan
Description
Introduction: The worldwide COVID-19 pandemic, which began in December 2019 and has lasted for almost 3 years now, has undergone many changes and has changed public perceptions and attitudes. Various systems for predicting the progression of the pandemic have been developed to help assess the risk of COVID-19 spreading. In a case study in Japan, we attempt to determine whether the trend of emotions toward COVID-19 expressed on social media, specifically Twitter, can be used to enhance COVID-19 case prediction system performance.

Methods: We use emoji as a proxy to shallowly capture the trend in emotion expression on Twitter. Two aspects of emoji are studied: the surface trend in emoji usage by using the tweet count and the structural interaction of emoji by using an anomalous score.

Results: Our experimental results show that utilizing emoji improved system performance in the majority of evaluations.