100+ datasets found

Twitter Sentiments Dataset
data.mendeley.com
Updated May 14, 2021
Cite
SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1
Explore at:
Unique identifier
https://doi.org/10.17632/z9zw7nt5h2.1
Dataset updated
May 14, 2021
Authors
SHERIF HUSSEIN
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset has three sentiment classes: negative, neutral, and positive. It contains two fields, the tweet and its label.
Twitter dataset
figshare.com
csv
Updated Feb 11, 2025
Cite
Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan (2025). Twitter dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28390334.v2
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.28390334.v2
Dataset updated
Feb 11, 2025
Dataset provided by
figshare
Authors
Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains tweets labeled for sentiment analysis, categorized into Positive, Negative, and Neutral sentiments. The dataset includes tweet IDs, user metadata, sentiment labels, and tweet text, making it suitable for Natural Language Processing (NLP), machine learning, and AI-based sentiment classification research. Originally sourced from Kaggle, this dataset is curated for improved usability in social media sentiment analysis.
financial-tweets-sentiment
huggingface.co
Updated Dec 15, 2023
Cite
Tim Koornstra (2023). financial-tweets-sentiment [Dataset]. https://huggingface.co/datasets/TimKoornstra/financial-tweets-sentiment
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 15, 2023
Authors
Tim Koornstra
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Financial Sentiment Analysis Dataset

Overview

This dataset is a comprehensive collection of tweets focused on financial topics, meticulously curated to assist in sentiment analysis in the domain of finance and stock markets. It serves as a valuable resource for training machine learning models to understand and predict sentiment trends based on social media discourse, particularly within the financial sector.

Data Description

The dataset comprises tweets… See the full description on the dataset page: https://huggingface.co/datasets/TimKoornstra/financial-tweets-sentiment.
twitter_posts
huggingface.co
Updated Mar 6, 2024
Cite
Christian Laviolette (2024). twitter_posts [Dataset]. https://huggingface.co/datasets/claviole/twitter_posts
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 6, 2024
Authors
Christian Laviolette
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for "Large twitter tweets sentiment analysis"

Dataset Description Dataset Summary

This dataset is a collection of tweets formatted in a tabular data structure, annotated for sentiment analysis. Each tweet is associated with a sentiment label, with 1 indicating a Positive sentiment and 0 for a Negative sentiment.

Languages

The tweets are in English.

Dataset Structure Data Instances

An instance of the dataset includes… See the full description on the dataset page: https://huggingface.co/datasets/claviole/twitter_posts.
Brussel mobility Twitter sentiment analysis CSV Dataset
data.niaid.nih.gov
Updated May 31, 2024
Cite
van Vessem, Charlotte (2024). Brussel mobility Twitter sentiment analysis CSV Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11401123
Explore at:
Dataset updated
May 31, 2024
Dataset provided by
Ginis, Vincent
Tori, Floriano
van Vessem, Charlotte
Betancur Arenas, Juliana
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Brussels
Description
SSH CENTRE (Social Sciences and Humanities for Climate, Energy aNd Transport Research Excellence) is a Horizon Europe project, engaging directly with stakeholders across research, policy, and business (including citizens) to strengthen social innovation, SSH-STEM collaboration, transdisciplinary policy advice, inclusive engagement, and SSH communities across Europe, accelerating the EU’s transition to carbon neutrality. SSH CENTRE is based on a range of activities related to Open Science, inclusivity and diversity, especially with regard to Southern and Eastern Europe and different career stages, including: development of novel SSH-STEM collaborations to facilitate the delivery of the EU Green Deal; SSH knowledge brokerage to support regions in transition; and the effective design of strategies for citizen engagement in EU R&I activities. Outputs include action-led agendas and building stakeholder synergies through regular Policy Insight events. This is captured in a high-profile virtual SSH CENTRE generating and sharing best practice for SSH policy advice, overcoming fragmentation to accelerate the EU’s journey to a sustainable future.

The documents uploaded here are part of WP2, whereby novel, interdisciplinary teams were provided funding to undertake activities to develop a policy recommendation related to EU Green Deal policy. Each of these policy recommendations, and the activities that inform them, will be written up as a chapter in an edited book collection. Three books will make up this edited collection: one on climate, one on energy, and one on mobility.

As part of writing a chapter for the SSH CENTRE book on ‘Mobility’, we set out to analyse the sentiment of Twitter users regarding shared and active mobility modes in Brussels. This involved collecting tweets between 2017 and 2022. A tweet was collected if it contained a previously defined mobility keyword (for example: metro) and either the name of a (local) politician, a neighbourhood or municipality, or a (shared) mobility provider. The file attached to this Zenodo webpage is a CSV file containing the tweets collected.
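The collection rule described above (a mobility keyword plus at least one politician, area, or provider name) can be sketched as a simple filter. This is only an illustration: the function names and the term lists are hypothetical, only "metro" comes from the description, and whole-word, case-insensitive matching is an assumption rather than the authors' documented method.

```python
import re

def contains_any(text, terms):
    """True if any term occurs in text as a whole word (case-insensitive)."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE)
        for term in terms
    )

def should_collect(text, mobility_keywords, context_terms):
    """Keep a tweet only if it has a mobility keyword AND a context term."""
    return contains_any(text, mobility_keywords) and contains_any(text, context_terms)

# Hypothetical term lists; the real keyword lists are not given in the description.
MOBILITY_KEYWORDS = ["metro"]
CONTEXT_TERMS = ["Ixelles", "Schaerbeek"]  # placeholder area names

keep = should_collect("The metro in Ixelles is late again", MOBILITY_KEYWORDS, CONTEXT_TERMS)
```

A tweet that mentions only a keyword, or only a context term, is rejected under this rule.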

Twitter Sentiment Analysis Datasets

brightdata.com

.json, .csv, .xlsx

Updated Sep 5, 2025

Cite

Bright Data (2025). Twitter Sentiment Analysis Datasets [Dataset]. https://brightdata.com/products/datasets/twitter/sentiment-analysis

Explore at:

Available download formats: .json, .csv, .xlsx

Dataset updated

Sep 5, 2025

Dataset authored and provided by

Bright Data (https://brightdata.com/)

License

https://brightdata.com/license

Area covered

Worldwide

Description

Our Twitter Sentiment Analysis Dataset provides a comprehensive collection of tweets, enabling businesses, researchers, and analysts to assess public sentiment, track trends, and monitor brand perception in real time. This dataset includes detailed metadata for each tweet, allowing for in-depth analysis of user engagement, sentiment trends, and social media impact.

Key Features:

  Tweet Content & Metadata: Includes tweet text, hashtags, mentions, media attachments, and engagement metrics such as likes, retweets, and replies.
  Sentiment Classification: Analyze sentiment polarity (positive, negative, neutral) to gauge public opinion on brands, events, and trending topics.
  Author & User Insights: Access user details such as username, profile information, follower count, and account verification status.
  Hashtag & Topic Tracking: Identify trending hashtags and keywords to monitor conversations and sentiment shifts over time.
  Engagement Metrics: Measure tweet performance based on likes, shares, and comments to evaluate audience interaction.
  Historical & Real-Time Data: Choose from historical datasets for trend analysis or real-time data for up-to-date sentiment tracking.


Use Cases:

  Brand Monitoring & Reputation Management: Track public sentiment around brands, products, and services to manage reputation and customer perception.
  Market Research & Consumer Insights: Analyze consumer opinions on industry trends, competitor performance, and emerging market opportunities.
  Political & Social Sentiment Analysis: Evaluate public opinion on political events, social movements, and global issues.
  AI & Machine Learning Applications: Train sentiment analysis models for natural language processing (NLP) and predictive analytics.
  Advertising & Campaign Performance: Measure the effectiveness of marketing campaigns by analyzing audience engagement and sentiment.



  Our dataset is available in multiple formats (JSON, CSV, Excel) and can be delivered via API, cloud storage (AWS, Google Cloud, Azure), or direct download. 
  Gain valuable insights into social media sentiment and enhance your decision-making with high-quality, structured Twitter data.

Dataset for twitter Sentiment Analysis using Roberta and Vader
data.mendeley.com
Updated May 14, 2023
Cite
Jannatul Ferdoshi Jannatul Ferdoshi (2023). Dataset for twitter Sentiment Analysis using Roberta and Vader [Dataset]. http://doi.org/10.17632/2sjt22sb55.1
Explore at:
Unique identifier
https://doi.org/10.17632/2sjt22sb55.1
Dataset updated
May 14, 2023
Authors
Jannatul Ferdoshi Jannatul Ferdoshi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Our dataset comprises 1000 tweets, which were taken from Twitter using the Python programming language. The dataset was stored in a CSV file and generated using various modules. The random module was used to generate random IDs and text, while the faker module was used to generate random user names and dates. Additionally, the textblob module was used to assign a random sentiment to each tweet.

This systematic approach ensures that the dataset is well-balanced and represents different types of tweets, user behavior, and sentiment. It is essential to have a balanced dataset to ensure that the analysis and visualization of the dataset are accurate and reliable. By generating tweets with a range of sentiments, we have created a diverse dataset that can be used to analyze and visualize sentiment trends and patterns.

In addition to generating the tweets, we have also prepared a visual representation of the data sets. This visualization provides an overview of the key features of the dataset, such as the frequency distribution of the different sentiment categories, the distribution of tweets over time, and the user names associated with the tweets. This visualization will aid in the initial exploration of the dataset and enable us to identify any patterns or trends that may be present.
Twitter Sentiment Analysis - 1M data
kaggle.com
Updated Mar 30, 2023
Cite
Amirhossein Ahmadnejad (2023). Twitter Sentiment Analysis - 1M data [Dataset]. https://www.kaggle.com/datasets/amirhoseinahmadnejad/twitter-sentiment-analysis-1m-data
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 30, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Amirhossein Ahmadnejad
Description
This dataset is a combination of more than 6 different datasets found on Kaggle. The labels are 0 and 1, meaning negative and positive tweets respectively. In the cleaned dataset, I deleted mentions. You can do any preprocessing you want on the dataset. I would appreciate any notebooks submitted on this dataset to help others with sentiment analysis tasks. I will submit mine as well.
Turkish Tweets Dataset
kaggle.com
Updated Apr 9, 2021
Cite
Anil Guven (2021). Turkish Tweets Dataset [Dataset]. https://www.kaggle.com/datasets/anil1055/turkish-tweet-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 9, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Anil Guven
Description
The dataset consists of 5 emotion labels: anger, happy, distinguish, surprise, and fear. There are 800 tweets in the dataset for each label, so the total tweet count is 4000.

You can use the data set in many areas such as sentiment, emotion analysis and topic modeling.

Info: Hashtags and usernames were removed from the dataset. The dataset has been used in several studies, listed below:
- (please cite this article) Güven, Z. A., Diri, B., & Çakaloğlu, T. (2020). Comparison of n-stage Latent Dirichlet Allocation versus other topic modeling methods for emotion analysis. Journal of the Faculty of Engineering and Architecture of Gazi University. https://doi.org/10.17341/gazimmfd.556104
- Güven, Z. A., Diri, B., & Çakaloğlu, T. (2019). Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets. Academic Platform Journal of Engineering and Science. https://doi.org/10.21541/apjes.459447
- Guven, Z. A., Diri, B., & Cakaloglu, T. (2019). Comparison Method for Emotion Detection of Twitter Users. Proceedings - 2019 Innovations in Intelligent Systems and Applications Conference, ASYU 2019. https://doi.org/10.1109/ASYU48272.2019.8946435
Sentiment Analysis Dataset
cubig.ai
Updated May 20, 2025
Cite
CUBIG (2025). Sentiment Analysis Dataset [Dataset]. https://cubig.ai/store/products/270/sentiment-analysis-dataset
Explore at:
Dataset updated
May 20, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Sentiment Analysis Dataset is a dataset for sentiment analysis. It includes large-scale tweet text collected from Twitter and an emotional polarity label (0 = negative, 2 = neutral, 4 = positive) for each tweet, assigned automatically based on emoticons.

2) Data Utilization
(1) Characteristics of the Sentiment Analysis Dataset:
• Each sample consists of six columns: emotional polarity, tweet ID, date of writing, search word, author, and tweet body. It is suitable for training natural language processing and classification models using tweet text and emotion labels.
(2) The Sentiment Analysis Dataset can be used for:
• Emotional classification model development: Using tweet text and emotional polarity labels, you can build automatic classifiers for positive, negative, and neutral sentiment with various machine learning and deep learning models such as logistic regression, SVM, RNN, and LSTM.
• Analysis of SNS public opinion and trends: By analyzing the distribution of emotions over time and by keyword, you can explore changes in public opinion on specific issues or brands, positive and negative trends, and key emotional keywords.
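As a rough illustration of the six-column layout and the 0/2/4 polarity coding described above, the sketch below reads rows and attaches a readable sentiment name. The column names, the column order, the absence of a header row, and the two sample rows are all assumptions made for the example; they are not taken from the dataset itself.

```python
import csv
import io

# Assumed column names and order, following the description's list of six columns.
COLUMNS = ["polarity", "tweet_id", "date", "query", "author", "text"]
POLARITY_NAMES = {0: "negative", 2: "neutral", 4: "positive"}

def read_rows(handle):
    """Yield one dict per CSV row, with a decoded 'sentiment' field added."""
    for raw in csv.reader(handle):
        row = dict(zip(COLUMNS, raw))
        row["sentiment"] = POLARITY_NAMES[int(row["polarity"])]
        yield row

# Made-up sample rows, for illustration only.
sample = io.StringIO(
    '4,1,Mon Apr 06 2009,NO_QUERY,alice,"great day, really"\n'
    "0,2,Mon Apr 06 2009,NO_QUERY,bob,so tired\n"
)
rows = list(read_rows(sample))
```

Using `csv.reader` rather than splitting on commas keeps quoted tweet bodies that contain commas intact.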
sentiment-analysis-tweet
huggingface.co
Cite
LYT, sentiment-analysis-tweet [Dataset]. https://huggingface.co/datasets/LYTinn/sentiment-analysis-tweet
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Authors
LYT
Description
LYTinn/sentiment-analysis-tweet dataset hosted on Hugging Face and contributed by the HF Datasets community
Swahili-tweet-sentiment
huggingface.co
Cite
Davis David, Swahili-tweet-sentiment [Dataset]. https://huggingface.co/datasets/Davis/Swahili-tweet-sentiment
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Authors
Davis David
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
A new Swahili tweet dataset for sentiment analysis.

Issues ⚠️

In case you have any difficulties or issues while trying to run the script, you can raise them in the issues section.

Pull Requests 🔧

If you have something to add or a new idea to implement, you are welcome to create a pull request with the improvement.

Give it a Like 👍

If you find this dataset useful, give it a like so that more people can get to know it.

Credits

All the credits to Davis David… See the full description on the dataset page: https://huggingface.co/datasets/Davis/Swahili-tweet-sentiment.
Slovene tweet sentiment analysis
live.european-language-grid.eu
Updated Jul 8, 2021
Cite
Faculty of Computer and Information Science, University of Ljubljana (2021). Slovene tweet sentiment analysis [Dataset]. https://live.european-language-grid.eu/catalogue/tool-service/9211
Explore at:
Dataset updated
Jul 8, 2021
Dataset authored and provided by
Faculty of Computer and Information Science, University of Ljubljana
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Sentiment classifier for the Slovene language, trained on labeled Slovenian tweets (http://hdl.handle.net/11356/1054), using the SloBERTa language model. The tool classifies sentences (or short paragraphs) into three predefined classes (positive, negative, or neutral), based on the sentiment/stance of the input text.
Twitter Sentiment Analysis
kaggle.com
Updated Sep 30, 2020
Cite
Shanks0465 (2020). Twitter Sentiment Analysis [Dataset]. https://www.kaggle.com/shanks0465/twitter-sentiment-analysis/activity
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 30, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shanks0465
Description
Context

Twitter Sentiment Analysis Dataset especially for classification using Logistic Regression

Content

tweet - Preprocessed token array for each tweet (preprocessing steps: remove hyperlinks, remove hashtags, remove stop words and punctuation)

bias - Just a simple bias value (default 1)

pos - Sum of positive frequencies of each word in the tweet tokens.

neg - Sum of negative frequencies of each word in the tweet tokens.

label - 1.0 for Positive Tweet and 0.0 for Negative Tweet.
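
The bias, pos, and neg columns above can be reproduced roughly as follows. This is a minimal sketch, assuming that a word's "positive frequency" is the number of times it appears in positively labeled tweets (and likewise for negative); the description does not spell this out, and the function names and toy tweets are hypothetical.

```python
from collections import Counter

def build_freqs(tweets, labels):
    """Count how often each token occurs under each label."""
    freqs = Counter()
    for tokens, label in zip(tweets, labels):
        for token in tokens:
            freqs[(token, label)] += 1
    return freqs

def extract_features(tokens, freqs):
    """Return [bias, pos, neg] for one preprocessed tweet."""
    pos = sum(freqs[(token, 1.0)] for token in tokens)
    neg = sum(freqs[(token, 0.0)] for token in tokens)
    return [1.0, float(pos), float(neg)]

# Toy corpus, for illustration only.
tweets = [["happy", "great"], ["sad", "bad"], ["happy", "sad"]]
labels = [1.0, 0.0, 1.0]
freqs = build_freqs(tweets, labels)
features = extract_features(["happy", "sad"], freqs)
```

The three-element vector is what a logistic regression model would take as input for each tweet, with the label column as the target.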

Acknowledgements

This dataset was part of the Week 1 Labs of the Coursera Natural Language Processing course. It was custom-created from scratch using the NLTK library for text preprocessing, and all preprocessing functions were written from scratch.
Twitter Sentiment Analysis Dataset - Dataset - LDM
service.tib.eu
Updated Nov 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2024). Twitter Sentiment Analysis Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/twitter-sentiment-analysis-dataset
Explore at:
Dataset updated
Nov 25, 2024
Description
The dataset comprises tweets labeled with sentiment ratings on an ordinal five-point scale, with classes for strongly negative, negative, neutral, positive, and strongly positive.
Coronavirus (COVID-19) Tweets Sentiment Trend
ieee-dataport.org
Updated Nov 4, 2022
Cite
Rabindra Lamsal (2022). Coronavirus (COVID-19) Tweets Sentiment Trend [Dataset]. https://ieee-dataport.org/open-access/coronavirus-covid-19-tweets-sentiment-trend
Explore at:
Dataset updated
Nov 4, 2022
Authors
Rabindra Lamsal
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset gives a cursory glimpse at the overall sentiment trend of the public discourse regarding the COVID-19 pandemic on Twitter. The live scatter plot of this dataset is available as The Overall Trend block at https://live.rlamsal.com.np. The trend graph reveals multiple peaks and drops that need further analysis. The n-grams during those peaks and drops can prove beneficial for better understanding the discourse.
Tweets Dataset
brightdata.com
.json, .csv, .xlsx
Updated Nov 13, 2024
Cite
Bright Data (2024). Tweets Dataset [Dataset]. https://brightdata.com/products/datasets/twitter/tweets
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Nov 13, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our Tweets dataset for a range of applications to enhance business strategies and market insights. Analyzing this dataset offers a comprehensive view of social media dynamics, empowering organizations to optimize their communication and marketing strategies. Access the full dataset or select specific data points tailored to your needs. Popular use cases include sentiment analysis to gauge public opinion and brand perception, competitor analysis by examining engagement and sentiment around rival brands, and crisis management through real-time tracking of tweet sentiment and influential voices during critical events.
Data from: IA Tweets Analysis Dataset (Spanish)
data.niaid.nih.gov
Updated Aug 3, 2024
Cite
Serrano-Fernández, Alejandro (2024). IA Tweets Analysis Dataset (Spanish) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10821484
Explore at:
Dataset updated
Aug 3, 2024
Dataset provided by
Guerrero-Contreras, Gabriel
Balderas-Díaz, Sara
Muñoz, Andrés
Serrano-Fernández, Alejandro
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General Description

This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

Data Collection Method

Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

Dataset Content

ID: A unique identifier for each tweet.

text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

user_followers_count: The current number of followers the account has. It is a non-negative integer.

user_friends_count: The number of users that the account is following. It is a non-negative integer.

user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.
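
The field constraints listed above (text of at most 280 characters, non-negative integer counts, and True/False flags) can be checked when loading the CSV. The sketch below is one way to do that; it assumes the booleans are stored as the strings "True" and "False" and that each row arrives as a dict of strings (for example from `csv.DictReader`). The helper names and the example row are hypothetical.

```python
MAX_TEXT_LENGTH = 280
COUNT_FIELDS = [
    "favorite_count", "retweet_count", "user_followers_count",
    "user_friends_count", "user_favourites_count", "user_statuses_count",
]
BOOL_FIELDS = [
    "user_verified", "user_default_profile", "user_has_extended_profile",
    "user_protected", "user_is_translator",
]

def parse_bool(value):
    """Convert the strings 'True' / 'False' to bool; reject anything else."""
    if value in ("True", "False"):
        return value == "True"
    raise ValueError(f"expected True or False, got {value!r}")

def parse_row(row):
    """Validate one row (dict of strings) and return typed values."""
    if len(row["text"]) > MAX_TEXT_LENGTH:
        raise ValueError("text exceeds 280 characters")
    record = {"ID": row["ID"], "text": row["text"], "polarity": row["polarity"]}
    for name in COUNT_FIELDS:
        count = int(row[name])
        if count < 0:
            raise ValueError(f"{name} must be non-negative")
        record[name] = count
    for name in BOOL_FIELDS:
        record[name] = parse_bool(row[name])
    return record

# Made-up example row, for illustration only.
example = {"ID": "1", "text": "La IA es fascinante", "polarity": "Positive"}
example.update({name: "3" for name in COUNT_FIELDS})
example.update({name: "False" for name in BOOL_FIELDS})
parsed = parse_row(example)
```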

Cite as

Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

Potential Use Cases

This dataset is aimed at academic researchers and practitioners with interests in:

Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

Exploring correlations between user engagement metrics and sentiment in discussions about AI.

Data Format and File Type

The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

License

The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.
AfriSenti-Twitter
huggingface.co
opendatalab.com
Updated Feb 19, 2023
Cite
HausaNLP (2023). AfriSenti-Twitter [Dataset]. https://huggingface.co/datasets/HausaNLP/AfriSenti-Twitter
Explore at:
Dataset updated
Feb 19, 2023
Dataset authored and provided by
HausaNLP
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AfriSenti is the largest sentiment analysis benchmark dataset for under-represented African languages, covering 110,000+ annotated tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yoruba).
Dataset
figshare.com
application/gzip
Updated Oct 17, 2021
Cite
Francisco Donoso (2021). Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.16823500.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.16823500.v1
Dataset updated
Oct 17, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Francisco Donoso
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
text -> Original tweet text, as downloaded through the Twitter API
text_final -> Text after cleanup, as used for lexicon matching
hour -> Date and time (only hour) at which the tweet was published
time_cat -> Dummy. Whether the tweet was published during the event [“cuenta”] or before/after the event [“no_cuenta”]
rt_count -> Number of retweets at the moment of the download of the data
url -> Dummy. The tweet text includes a url
media -> Dummy. The tweet had photo or video
EmoWord -> Total number of matches for emotional words or stems in the tweet text
MoralWord -> Total number of matches for moral words or stems in the tweet text
EmoMoralWord -> Total number of matches for both moral and emotional words or stems in the tweet text
OnlyEmoWord -> Number of matches for emotional words or stems in tweet text (excluding those which also matched moral words or stems)
OnlyMoralWord -> Number of matches for moral words or stems in tweet text (excluding those which also matched emotional words or stems)
foll_div10 -> Number of followers of the account that published the tweet at the moment of the data download, divided by 10,000
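
The five lexicon-count fields relate to each other in a way that a short sketch can make concrete. The code below is an illustration under stated assumptions, not the author's procedure: stems are matched as token prefixes, a token that matches both lexicons counts toward EmoWord, MoralWord, and EmoMoralWord, and the stem lists are hypothetical placeholders.

```python
def count_matches(tokens, emo_stems, moral_stems):
    """Count emotional, moral, and overlapping lexicon matches in a token list."""
    emo = moral = both = 0
    for token in tokens:
        is_emo = any(token.startswith(stem) for stem in emo_stems)
        is_moral = any(token.startswith(stem) for stem in moral_stems)
        emo += is_emo
        moral += is_moral
        both += is_emo and is_moral
    return {
        "EmoWord": emo,
        "MoralWord": moral,
        "EmoMoralWord": both,
        "OnlyEmoWord": emo - both,      # emotional matches that are not also moral
        "OnlyMoralWord": moral - both,  # moral matches that are not also emotional
    }

# Hypothetical stems and tokens, for illustration only.
EMO_STEMS = ["fear", "hate"]
MORAL_STEMS = ["hate", "just"]
counts = count_matches(["fearful", "hate", "justice", "house"], EMO_STEMS, MORAL_STEMS)

# foll_div10 as described: follower count divided by 10,000.
foll_div10 = 25000 / 10000
```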
