100+ datasets found

m
Twitter Sentiments Dataset
data.mendeley.com
Updated May 14, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1
Explore at:
Unique identifier
https://doi.org/10.17632/z9zw7nt5h2.1
Dataset updated
May 14, 2021
Authors
SHERIF HUSSEIN
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset has three sentiments namely, negative, neutral, and positive. It contains two fields for the tweet and label.
Twitter dataset
figshare.com
csv
Updated Feb 11, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan (2025). Twitter dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28390334.v2
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.28390334.v2
Dataset updated
Feb 11, 2025
Dataset provided by
figshare
Authors
Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains tweets labeled for sentiment analysis, categorized into Positive, Negative, and Neutral sentiments. The dataset includes tweet IDs, user metadata, sentiment labels, and tweet text, making it suitable for Natural Language Processing (NLP), machine learning, and AI-based sentiment classification research. Originally sourced from Kaggle, this dataset is curated for improved usability in social media sentiment analysis.
Z
Brussel mobility Twitter sentiment analysis CSV Dataset
data.niaid.nih.gov
zenodo.org
Updated May 31, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tori, Floriano; Betancur Arenas, Juliana; Ginis, Vincent; van Vessem, Charlotte (2024). Brussel mobility Twitter sentiment analysis CSV Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11401123
Explore at:
Dataset updated
May 31, 2024
Authors
Tori, Floriano; Betancur Arenas, Juliana; Ginis, Vincent; van Vessem, Charlotte
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Brussels
Description
SSH CENTRE (Social Sciences and Humanities for Climate, Energy aNd Transport Research Excellence) is a Horizon Europe project, engaging directly with stakeholders across research, policy, and business (including citizens) to strengthen social innovation, SSH-STEM collaboration, transdisciplinary policy advice, inclusive engagement, and SSH communities across Europe, accelerating the EU’s transition to carbon neutrality. SSH CENTRE is based in a range of activities related to Open Science, inclusivity and diversity – especially with regards Southern and Eastern Europe and different career stages – including: development of novel SSH-STEM collaborations to facilitate the delivery of the EU Green Deal; SSH knowledge brokerage to support regions in transition; and the effective design of strategies for citizen engagement in EU R&I activities. Outputs include action-led agendas and building stakeholder synergies through regular Policy Insight events.This is captured in a high-profile virtual SSH CENTRE generating and sharing best practice for SSH policy advice, overcoming fragmentation to accelerate the EU’s journey to a sustainable future.The documents uploaded here are part of WP2 whereby novel, interdisciplinary teams were provided funding to undertake activities to develop a policy recommendation related to EU Green Deal policy. Each of these policy recommendations, and the activities that inform them, will be written-up as a chapter in an edited book collection. Three books will make up this edited collection - one on climate, one on energy and one on mobility. As part of writing a chapter for the SSH CENTRE book on ‘Mobility’, we set out to analyse the sentiment of users on Twitter regarding shared and active mobility modes in Brussels. This involved us collecting tweets between 2017-2022. A tweet was collected if it contained a previously defined mobility keyword (for example: metro) and either the name of a (local) politician, a neighbourhood or municipality, or a (shared) mobility provider. The files attached to this Zenodo webpage is a csv files containing the tweets collected.”.
Twitter Sentiment Analysis
kaggle.com
Updated Apr 16, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
raj713335 (2023). Twitter Sentiment Analysis [Dataset]. https://www.kaggle.com/datasets/raj713335/twittesentimentanalysis/discussion
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 16, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
raj713335
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
About Dataset

Context

This is the Twitter Sentiment Analysis dataset. It contains 1 Million tweets extracted using the Twitter Opensource API. The tweets have been annotated (0 = negative, 4 = positive) and they can be used primarily to detect sentiment.

Content It contains the following 6 fields:

target: the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)

ids: The id of the tweet ( 2087)

date: the date of the tweet (Sat April 15 23:58:44 UTC 2023)

flag: The query (lyx). If there is no query, then this value is NO_QUERY.

user: The user that tweeted (raj713335)

**text: **the text of the tweet (Lyx is cool)

Acknowledgments The official link regarding the dataset with resources about how it was generated is here The official paper detailing the approach is here

Citation: Go, A., Bhayani, R. and Huang, L., 2009. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, 1(2009), p.12.

Inspiration To detect severity from tweets. You may have a look at this.

Twitter Sentiment Analysis Datasets

brightdata.com

.json, .csv, .xlsx

Updated Jul 4, 2024

Facebook

Twitter

Click to copy link

Link copied

Cite

Bright Data (2024). Twitter Sentiment Analysis Datasets [Dataset]. https://brightdata.com/products/datasets/twitter/sentiment-analysis

Explore at:

.json, .csv, .xlsxAvailable download formats

Dataset updated

Jul 4, 2024

Dataset authored and provided by

Bright Datahttps://brightdata.com/

License

https://brightdata.com/licensehttps://brightdata.com/license

Area covered

Worldwide

Description

Our Twitter Sentiment Analysis Dataset provides a comprehensive collection of tweets, enabling businesses, researchers, and analysts to assess public sentiment, track trends, and monitor brand perception in real time. This dataset includes detailed metadata for each tweet, allowing for in-depth analysis of user engagement, sentiment trends, and social media impact.

Key Features:

  Tweet Content & Metadata: Includes tweet text, hashtags, mentions, media attachments, and engagement metrics such as likes, retweets, and replies.
  Sentiment Classification: Analyze sentiment polarity (positive, negative, neutral) to gauge public opinion on brands, events, and trending topics.
  Author & User Insights: Access user details such as username, profile information, follower count, and account verification status.
  Hashtag & Topic Tracking: Identify trending hashtags and keywords to monitor conversations and sentiment shifts over time.
  Engagement Metrics: Measure tweet performance based on likes, shares, and comments to evaluate audience interaction.
  Historical & Real-Time Data: Choose from historical datasets for trend analysis or real-time data for up-to-date sentiment tracking.


Use Cases:

  Brand Monitoring & Reputation Management: Track public sentiment around brands, products, and services to manage reputation and customer perception.
  Market Research & Consumer Insights: Analyze consumer opinions on industry trends, competitor performance, and emerging market opportunities.
  Political & Social Sentiment Analysis: Evaluate public opinion on political events, social movements, and global issues.
  AI & Machine Learning Applications: Train sentiment analysis models for natural language processing (NLP) and predictive analytics.
  Advertising & Campaign Performance: Measure the effectiveness of marketing campaigns by analyzing audience engagement and sentiment.



  Our dataset is available in multiple formats (JSON, CSV, Excel) and can be delivered via API, cloud storage (AWS, Google Cloud, Azure), or direct download. 
  Gain valuable insights into social media sentiment and enhance your decision-making with high-quality, structured Twitter data.

h
tweet_sentiment_multilingual
huggingface.co
opendatalab.com
Updated Dec 25, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Cardiff NLP (2022). tweet_sentiment_multilingual [Dataset]. https://huggingface.co/datasets/cardiffnlp/tweet_sentiment_multilingual
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 25, 2022
Dataset authored and provided by
Cardiff NLP
Description
Dataset Card for cardiffnlp/tweet_sentiment_multilingual

Dataset Summary

Tweet Sentiment Multilingual consists of sentiment analysis dataset on Twitter in 8 different lagnuages.

arabic english french german hindi italian portuguese spanish

Supported Tasks and Leaderboards

text_classification: The dataset can be trained using a SentenceClassification model from HuggingFace transformers.

Dataset Structure Data Instances

An instance from… See the full description on the dataset page: https://huggingface.co/datasets/cardiffnlp/tweet_sentiment_multilingual.
Twitter Sentiment Analysis
kaggle.com
Updated Sep 30, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shanks0465 (2020). Twitter Sentiment Analysis [Dataset]. https://www.kaggle.com/shanks0465/twitter-sentiment-analysis/activity
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 30, 2020
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Shanks0465
Description
Context

Twitter Sentiment Analysis Dataset especially for classification using Logistic Regression

Content

tweet - Preprocessed token array for each tweet (Preprocessing done are remove hyperlinks, remove hashtags, remove stop words and punctuation)

bias - Just a simple bias value (default 1)

pos - Sum of positive frequencies of each word in the tweet tokens.

neg - Sum of negative frequencies of each word in the tweet tokens.

label - 1.0 for Positive Tweet and 0.0 for Negative Tweet.

Acknowledgements

This dataset was part of the Week 1 Labs of Coursera Natural Language Processing Course. This dataset was custom created from scratch using NLTL Library for text preprocessing and all functions for preprocessing were from scratch.
r
Twitter Sentiment Analysis Dataset
resodate.org
service.tib.eu
Updated Nov 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sara Rosenthal; Noura Farra; Preslav Nakov (2024). Twitter Sentiment Analysis Dataset [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvdHdpdHRlci1zZW50aW1lbnQtYW5hbHlzaXMtZGF0YXNldA==
Explore at:
Dataset updated
Nov 25, 2024
Dataset provided by
Leibniz Data Manager
Authors
Sara Rosenthal; Noura Farra; Preslav Nakov
Description
The dataset comprises tweets labeled with sentiment ratings in an ordinal five-point scale, including classes for strongly negative, negative, neutral, positive, and strongly positive.
h
AfriSenti-Twitter
huggingface.co
opendatalab.com
Updated Feb 19, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
HausaNLP (2023). AfriSenti-Twitter [Dataset]. https://huggingface.co/datasets/HausaNLP/AfriSenti-Twitter
Explore at:
Dataset updated
Feb 19, 2023
Dataset authored and provided by
HausaNLP
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AfriSenti is the largest sentiment analysis benchmark dataset for under-represented African languages---covering 110,000+ annotated tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and yoruba).
c
Twitter Tweets Sentiment Dataset
cubig.ai
zip
Updated Feb 25, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CUBIG (2025). Twitter Tweets Sentiment Dataset [Dataset]. https://cubig.ai/store/products/142/twitter-tweets-sentiment-dataset
Explore at:
zipAvailable download formats
Dataset updated
Feb 25, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data introduction • Twitter-tweets-sentiment dataset is a dataset that aims to analyze tweet sentiment for Twitter and natural language processing.

2) Data utilization (1)Twitter-tweets-sentiment data has characteristics that: • The data consists of three columns, including emotion and text, and aims to block negative tweets through a powerful classification model. (2) Twitter-tweets-sentiment data can be used to: • Social Media Monitoring: Businesses and organizations can use data to monitor social media platforms and gauge public sentiment about a brand, product, event, or social issue. • Sentiment analysis: This dataset can be used to train models that classify the sentiment of tweets, which can help companies and researchers understand public opinion on a variety of topics.
h
financial-tweets-sentiment
huggingface.co
Updated Dec 15, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tim Koornstra (2023). financial-tweets-sentiment [Dataset]. https://huggingface.co/datasets/TimKoornstra/financial-tweets-sentiment
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 15, 2023
Authors
Tim Koornstra
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Financial Sentiment Analysis Dataset

Overview

This dataset is a comprehensive collection of tweets focused on financial topics, meticulously curated to assist in sentiment analysis in the domain of finance and stock markets. It serves as a valuable resource for training machine learning models to understand and predict sentiment trends based on social media discourse, particularly within the financial sector.

Data Description

The dataset comprises tweets… See the full description on the dataset page: https://huggingface.co/datasets/TimKoornstra/financial-tweets-sentiment.
Bitcoin tweets - Market Sentiment
kaggle.com
Updated Aug 29, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gaurav Dutta (2021). Bitcoin tweets - Market Sentiment [Dataset]. https://www.kaggle.com/datasets/gauravduttakiit/bitcoin-tweets-16m-tweets-with-sentiment-tagged
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 29, 2021
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Gaurav Dutta
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Scrapped from twitters from 2016-01-01 to 2019-03-29, Collecting Tweets containing Bitcoin or BTC Tools used:

Twint Tweepy

Content

Tweet in multiple Language & Talked about Bitcoin

Acknowledgements

Thanks to Alex ( https://www.kaggle.com/alaix14 ) for his dataset (https://www.kaggle.com/alaix14/bitcoin-tweets-20160101-to-20190329 ), It is just an additional dimension where Sentiment is analyzed with a price change for Bitcoin
Twitter Sentiment Analysis Data
figshare.com
xls
Updated Dec 6, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Effie Chen (2019). Twitter Sentiment Analysis Data [Dataset]. http://doi.org/10.6084/m9.figshare.9770807.v2
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.9770807.v2
Dataset updated
Dec 6, 2019
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Effie Chen
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This excel work book includes NRC sentiment analysis for all hashtags, #pride tweets, #lesbian tweets, #pride NRC scores, # lesbian NRC scores, all sentiment scores in the syuzhet package for #pride and lesbian, lexicon comparison, #lesbian subsamples and #pride subsamples.
Z
Data from: IA Tweets Analysis Dataset (Spanish)
data.niaid.nih.gov
Updated Aug 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Serrano-Fernández, Alejandro (2024). IA Tweets Analysis Dataset (Spanish) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10821484
Explore at:
Dataset updated
Aug 3, 2024
Dataset provided by
University of Cadiz
Authors
Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General Description

This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

Data Collection Method

Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

Dataset Content

ID: A unique identifier for each tweet.

text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

user_followers_count: The current number of followers the account has. It is a non-negative integer.

user_friends_count: The number of users that the account is following. It is a non-negative integer.

user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.

Cite as

Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

Potential Use Cases

This dataset is aimed at academic researchers and practitioners with interests in:

Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

Exploring correlations between user engagement metrics and sentiment in discussions about AI.

Data Format and File Type

The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

License

The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.
m
The Climate Change Twitter Dataset
data.mendeley.com
Updated May 19, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dimitrios Effrosynidis (2022). The Climate Change Twitter Dataset [Dataset]. http://doi.org/10.17632/mw8yd7z9wc.2
Explore at:
Unique identifier
https://doi.org/10.17632/mw8yd7z9wc.2
Dataset updated
May 19, 2022
Authors
Dimitrios Effrosynidis
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
If you use the dataset, cite the paper: https://doi.org/10.1016/j.eswa.2022.117541

The most comprehensive dataset to date regarding climate change and human opinions via Twitter. It has the heftiest temporal coverage, spanning over 13 years, includes over 15 million tweets spatially distributed across the world, and provides the geolocation of most tweets. Seven dimensions of information are tied to each tweet, namely geolocation, user gender, climate change stance and sentiment, aggressiveness, deviations from historic temperature, and topic modeling, while accompanied by environmental disaster events information. These dimensions were produced by testing and evaluating a plethora of state-of-the-art machine learning algorithms and methods, both supervised and unsupervised, including BERT, RNN, LSTM, CNN, SVM, Naive Bayes, VADER, Textblob, Flair, and LDA.

The following columns are in the dataset:

➡ created_at: The timestamp of the tweet. ➡ id: The unique id of the tweet. ➡ lng: The longitude the tweet was written. ➡ lat: The latitude the tweet was written. ➡ topic: Categorization of the tweet in one of ten topics namely, seriousness of gas emissions, importance of human intervention, global stance, significance of pollution awareness events, weather extremes, impact of resource overconsumption, Donald Trump versus science, ideological positions on global warming, politics, and undefined. ➡ sentiment: A score on a continuous scale. This scale ranges from -1 to 1 with values closer to 1 being translated to positive sentiment, values closer to -1 representing a negative sentiment while values close to 0 depicting no sentiment or being neutral. ➡ stance: That is if the tweet supports the belief of man-made climate change (believer), if the tweet does not believe in man-made climate change (denier), and if the tweet neither supports nor refuses the belief of man-made climate change (neutral). ➡ gender: Whether the user that made the tweet is male, female, or undefined. ➡ temperature_avg: The temperature deviation in Celsius and relative to the January 1951-December 1980 average at the time and place the tweet was written. ➡ aggressiveness: That is if the tweet contains aggressive language or not.

Since Twitter forbids making public the text of the tweets, in order to retrieve it you need to do a process called hydrating. Tools such as Twarc or Hydrator can be used to hydrate tweets.
a
Sentiment-140 Dataset
datasets.activeloop.ai
tensorflow.org
+2more
deeplake
Updated Feb 3, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alec Go, Richa Bhayani, and Lei Huang (2022). Sentiment-140 Dataset [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/sentiment-140-dataset/
Explore at:
deeplakeAvailable download formats
Dataset updated
Feb 3, 2022
Dataset authored and provided by
Alec Go, Richa Bhayani, and Lei Huang
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time period covered
Jan 1, 2009 - Dec 31, 2009
Description
A dataset of tweets with sentiment labels, collected from Twitter in 2009. The dataset contains 1,800,000 tweets, with each tweet having a sentiment label of positive, negative, or neutral. The dataset is a valuable resource for training and evaluating sentiment analysis models.
i
Coronavirus (COVID-19) Tweets Sentiment Trend
ieee-dataport.org
Updated Nov 4, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rabindra Lamsal (2022). Coronavirus (COVID-19) Tweets Sentiment Trend [Dataset]. https://ieee-dataport.org/open-access/coronavirus-covid-19-tweets-sentiment-trend
Explore at:
Dataset updated
Nov 4, 2022
Authors
Rabindra Lamsal
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset gives a cursory glimpse at the overall sentiment trend of the public discourse regarding the COVID-19 pandemic on Twitter. The live scatter plot of this dataset is available as The Overall Trend block at https://live.rlamsal.com.np. The trend graph reveals multiple peaks and drops that need further analysis. The n-grams during those peaks and drops can prove beneficial for better understanding the discourse.
u
Data from: IA Tweets Analysis Dataset (Spanish)
produccioncientifica.uca.es
Updated 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés; Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés (2024). IA Tweets Analysis Dataset (Spanish) [Dataset]. https://produccioncientifica.uca.es/documentos/67321e53aea56d4af04854c2
Explore at:
Dataset updated
2024
Authors
Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés; Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés
Description
Cite as

Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

General Description

This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

Data Collection Method

Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

Dataset Content

ID: A unique identifier for each tweet.

text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

user_followers_count: The current number of followers the account has. It is a non-negative integer.

user_friends_count: The number of users that the account is following. It is a non-negative integer.

user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.

Potential Use Cases

This dataset is aimed at academic researchers and practitioners with interests in:

Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

Exploring correlations between user engagement metrics and sentiment in discussions about AI.

Data Format and File Type

The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

License

The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.
t
Sentiment Prediction Outputs for Twitter Dataset
test.researchdata.tuwien.at
bin, csv, png, txt
Updated May 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hachem Bouhamidi; Hachem Bouhamidi; Hachem Bouhamidi; Hachem Bouhamidi (2025). Sentiment Prediction Outputs for Twitter Dataset [Dataset]. http://doi.org/10.70124/c8v83-0sy11
Explore at:
bin, csv, png, txtAvailable download formats
Unique identifier
https://doi.org/10.70124/c8v83-0sy11
Dataset updated
May 20, 2025
Dataset provided by
TU Wien
Authors
Hachem Bouhamidi; Hachem Bouhamidi; Hachem Bouhamidi; Hachem Bouhamidi
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Apr 28, 2025
Description
Context and Methodology:

This dataset was created as part of a sentiment analysis project using enriched Twitter data. The objective was to train and test a machine learning model to automatically classify the sentiment of tweets (e.g., Positive, Negative, Neutral).
The data was generated using tweets that were sentiment-scored with a custom sentiment scorer. A machine learning pipeline was applied, including text preprocessing, feature extraction with CountVectorizer, and prediction with a HistGradientBoostingClassifier.

Technical Details:

The dataset includes five main files:

test_predictions_full.csv – Predicted sentiment labels for the test set.

sentiment_model.joblib – Trained machine learning model.

count_vectorizer.joblib – Text feature extraction model (CountVectorizer).

model_performance.txt – Evaluation metrics and performance report of the trained model.

confusion_matrix.png – Visualization of the model’s confusion matrix.

The files follow standard naming conventions based on their purpose.
The .joblib files can be loaded into Python using the joblib and scikit-learn libraries.
The .csv,.txt, and .png files can be opened with any standard text reader, spreadsheet software, or image viewer.
Additional performance documentation is included within the model_performance.txt file.

Additional Details:

The data was constructed to ensure reproducibility.

No personal or sensitive information is present.

It can be reused by researchers, data scientists, and students interested in Natural Language Processing (NLP), machine learning classification, and sentiment analysis tasks.
Z
Data from: Arabic news credibility on Twitter using sentiment analysis and...
data.niaid.nih.gov
Updated Jun 3, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Samdani, Duha; Taileb, Mounira; Almani, Nada (2023). Arabic news credibility on Twitter using sentiment analysis and ensemble learning [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8000716
Explore at:
Dataset updated
Jun 3, 2023
Authors
Samdani, Duha; Taileb, Mounira; Almani, Nada
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Arabic news credibility on Twitter using sentiment analysis and ensemble learning.

WHAT IS IT?

an Arabic news credibility model on Twitter using sentiment analysis and ensemble learning.

Here we include the Collected dataset and the source code of the proposed model written in Python language and using Keras library with Tensorflow backend.

Required Packages

Keras (https://keras.io/).

Scikit-learn (http://scikit-learn.org/)

Imnlearn (imbalanced-learn documentation — Version 0.10.1)

To Run the model

One data file is required to run the model which are:

The data that were used are the collected dataset in the file, set the path of the required data file in the code.

The dataset

There are the dataset file with all features, you can choose the features that you need and apply it on the model.

There are a description file that describe each feature in the news credibility dataset

The file Tweet_ID contains the list of tweets id in the dataset.

The annotated replies based on credibility is provided.

CONTACTS

If you want to report bugs or have general queries email to

Facebook

Twitter

Click to copy link

Link copied

Cite

SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1

Twitter Sentiments Dataset

Explore at:

Unique identifier

https://doi.org/10.17632/z9zw7nt5h2.1

Dataset updated

May 14, 2021

Authors

SHERIF HUSSEIN

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The dataset has three sentiments namely, negative, neutral, and positive. It contains two fields for the tweet and label.

Clear search

Close search

Google apps

Main menu

Twitter Sentiments Dataset

Twitter dataset

Brussel mobility Twitter sentiment analysis CSV Dataset

Twitter Sentiment Analysis

Twitter Sentiment Analysis Datasets

tweet_sentiment_multilingual

Twitter Sentiment Analysis

Twitter Sentiment Analysis Dataset

AfriSenti-Twitter

Twitter Tweets Sentiment Dataset

financial-tweets-sentiment

Bitcoin tweets - Market Sentiment

Context

Content

Acknowledgements

Twitter Sentiment Analysis Data

Data from: IA Tweets Analysis Dataset (Spanish)

The Climate Change Twitter Dataset

Sentiment-140 Dataset

Coronavirus (COVID-19) Tweets Sentiment Trend

Data from: IA Tweets Analysis Dataset (Spanish)

Sentiment Prediction Outputs for Twitter Dataset

Context and Methodology:

Technical Details:

Additional Details:

Data from: Arabic news credibility on Twitter using sentiment analysis and...

Twitter Sentiments DatasetSee More Versions

Twitter Sentiments Dataset