100+ datasets found
  1. m

    Twitter Sentiments Dataset

    • data.mendeley.com
    Updated May 14, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1
    Explore at:
    Dataset updated
    May 14, 2021
    Authors
    SHERIF HUSSEIN
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset has three sentiments namely, negative, neutral, and positive. It contains two fields for the tweet and label.

  2. Twitter Tweets Sentiment Dataset

    • kaggle.com
    zip
    Updated Apr 8, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
    Explore at:
    zip(1289519 bytes)Available download formats
    Dataset updated
    Apr 8, 2022
    Authors
    M Yasser H
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    https://raw.githubusercontent.com/Masterx-AI/Project_Twitter_Sentiment_Analysis_/main/twitt.jpg" alt="">

    Description:

    Twitter is an online Social Media Platform where people share their their though as tweets. It is observed that some people misuse it to tweet hateful content. Twitter is trying to tackle this problem and we shall help it by creating a strong NLP based-classifier model to distinguish the negative tweets & block such tweets. Can you build a strong classifier model to predict the same?

    Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

    Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.

    You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

    Columns:

    1. textID - unique ID for each piece of text
    2. text - the text of the tweet
    3. sentiment - the general sentiment of the tweet

    Acknowledgement:

    The dataset is download from Kaggle Competetions:
    https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

    Objective:

    • Understand the Dataset & cleanup (if required).
    • Build classification models to predict the twitter sentiments.
    • Compare the evaluation metrics of vaious classification algorithms.
  3. Twitter Sentiment Analysis Dataset

    • kaggle.com
    zip
    Updated Jul 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Durgesh Rao (2023). Twitter Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/durgeshrao9993/twitter-analysis-dataset-2022
    Explore at:
    zip(1291530 bytes)Available download formats
    Dataset updated
    Jul 3, 2023
    Authors
    Durgesh Rao
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The Twitter Sentiment Analysis Dataset is a widely used dataset in the field of natural language processing and sentiment analysis. It consists of a collection of tweets, each labeled with the sentiment expressed in the tweet, which can be positive, negative, or neutral. This dataset is commonly used for training and evaluating machine learning models that aim to automatically analyze and classify the sentiment behind Twitter messages.

    The dataset contains a diverse range of tweets, capturing the opinions, emotions, and attitudes of Twitter users on various topics such as movies, products, events, or general daily experiences. The tweets cover a broad spectrum of sentiments, including expressions of joy, satisfaction, anger, disappointment, sarcasm, or indifference.

  4. Twitter dataset

    • figshare.com
    csv
    Updated Feb 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan (2025). Twitter dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28390334.v2
    Explore at:
    csvAvailable download formats
    Dataset updated
    Feb 11, 2025
    Dataset provided by
    figshare
    Figsharehttp://figshare.com/
    Authors
    Shreyas Poojary; Mohammed Riza; Rashmi Laxmikant Malghan
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains tweets labeled for sentiment analysis, categorized into Positive, Negative, and Neutral sentiments. The dataset includes tweet IDs, user metadata, sentiment labels, and tweet text, making it suitable for Natural Language Processing (NLP), machine learning, and AI-based sentiment classification research. Originally sourced from Kaggle, this dataset is curated for improved usability in social media sentiment analysis.

  5. Twitter Sentiment Analysis using Roberta and Vader

    • kaggle.com
    zip
    Updated Oct 18, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jocelyn Dumlao (2023). Twitter Sentiment Analysis using Roberta and Vader [Dataset]. https://www.kaggle.com/datasets/jocelyndumlao/twitter-sentiment-analysis-using-roberta-and-vader
    Explore at:
    zip(32382 bytes)Available download formats
    Dataset updated
    Oct 18, 2023
    Authors
    Jocelyn Dumlao
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description

    Our dataset comprises 1000 tweets, which were taken from Twitter using the Python programming language. The dataset was stored in a CSV file and generated using various modules. The random module was used to generate random IDs and text, while the faker module was used to generate random user names and dates. Additionally, the textblob module was used to assign a random sentiment to each tweet.

    This systematic approach ensures that the dataset is well-balanced and represents different types of tweets, user behavior, and sentiment. It is essential to have a balanced dataset to ensure that the analysis and visualization of the dataset are accurate and reliable. By generating tweets with a range of sentiments, we have created a diverse dataset that can be used to analyze and visualize sentiment trends and patterns.

    In addition to generating the tweets, we have also prepared a visual representation of the data sets. This visualization provides an overview of the key features of the dataset, such as the frequency distribution of the different sentiment categories, the distribution of tweets over time, and the user names associated with the tweets. This visualization will aid in the initial exploration of the dataset and enable us to identify any patterns or trends that may be present.

    Categories

    Natural Language Processing, Machine Learning Algorithm, Deep Learning

    Acknowledgements & Source

    Jannatul Ferdoshi

    Institutions: BRAC University

    Data Source

    Image Source:Twitter Sentiment Analysis Using Python GeeksforGeeks | lacienciadelcafe.com.ar

    Please don't forget to upvote if you find this useful.

  6. Twitter Sentiment Analysis Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data, Twitter Sentiment Analysis Datasets [Dataset]. https://brightdata.com/products/datasets/twitter/sentiment-analysis
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Our Twitter Sentiment Analysis Dataset provides a comprehensive collection of tweets, enabling businesses, researchers, and analysts to assess public sentiment, track trends, and monitor brand perception in real time. This dataset includes detailed metadata for each tweet, allowing for in-depth analysis of user engagement, sentiment trends, and social media impact.

    Key Features:
    
      Tweet Content & Metadata: Includes tweet text, hashtags, mentions, media attachments, and engagement metrics such as likes, retweets, and replies.
      Sentiment Classification: Analyze sentiment polarity (positive, negative, neutral) to gauge public opinion on brands, events, and trending topics.
      Author & User Insights: Access user details such as username, profile information, follower count, and account verification status.
      Hashtag & Topic Tracking: Identify trending hashtags and keywords to monitor conversations and sentiment shifts over time.
      Engagement Metrics: Measure tweet performance based on likes, shares, and comments to evaluate audience interaction.
      Historical & Real-Time Data: Choose from historical datasets for trend analysis or real-time data for up-to-date sentiment tracking.
    
    
    Use Cases:
    
      Brand Monitoring & Reputation Management: Track public sentiment around brands, products, and services to manage reputation and customer perception.
      Market Research & Consumer Insights: Analyze consumer opinions on industry trends, competitor performance, and emerging market opportunities.
      Political & Social Sentiment Analysis: Evaluate public opinion on political events, social movements, and global issues.
      AI & Machine Learning Applications: Train sentiment analysis models for natural language processing (NLP) and predictive analytics.
      Advertising & Campaign Performance: Measure the effectiveness of marketing campaigns by analyzing audience engagement and sentiment.
    
    
    
      Our dataset is available in multiple formats (JSON, CSV, Excel) and can be delivered via API, cloud storage (AWS, Google Cloud, Azure), or direct download. 
      Gain valuable insights into social media sentiment and enhance your decision-making with high-quality, structured Twitter data.
    
  7. Brand Sentiment Analysis Dataset (Twitter)

    • kaggle.com
    zip
    Updated Jan 7, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tushar Paul (2024). Brand Sentiment Analysis Dataset (Twitter) [Dataset]. https://www.kaggle.com/datasets/tusharpaul2001/brand-sentiment-analysis-dataset
    Explore at:
    zip(375745 bytes)Available download formats
    Dataset updated
    Jan 7, 2024
    Authors
    Tushar Paul
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset description Users assessed tweets related to various brands and products, providing evaluations on whether the sentiment conveyed was positive, negative, or neutral. Additionally, if the tweet conveyed any sentiment, contributors identified the specific brand or product targeted by that emotion.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F11965067%2Fa48606bfcaf80acebbb6edff7895484a%2Fdownload.png?generation=1704673111671747&alt=media" alt="">

    Train Dataset : 8589 rows x 3 columns https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F11965067%2Fe998ba81ca461699a787ff7305486b24%2FTrainDS.JPG?generation=1704672608361793&alt=media" alt="">

    Test Dataset : 504 rows x 1 columns https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F11965067%2F07df18965e91f84df123270aabb641e1%2Ftest.JPG?generation=1704679582009718&alt=media" alt="">

  8. Z

    Brussel mobility Twitter sentiment analysis CSV Dataset

    • data.niaid.nih.gov
    Updated May 31, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tori, Floriano; Betancur Arenas, Juliana; Ginis, Vincent; van Vessem, Charlotte (2024). Brussel mobility Twitter sentiment analysis CSV Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11401123
    Explore at:
    Dataset updated
    May 31, 2024
    Authors
    Tori, Floriano; Betancur Arenas, Juliana; Ginis, Vincent; van Vessem, Charlotte
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Brussels
    Description

    SSH CENTRE (Social Sciences and Humanities for Climate, Energy aNd Transport Research Excellence) is a Horizon Europe project, engaging directly with stakeholders across research, policy, and business (including citizens) to strengthen social innovation, SSH-STEM collaboration, transdisciplinary policy advice, inclusive engagement, and SSH communities across Europe, accelerating the EU’s transition to carbon neutrality. SSH CENTRE is based in a range of activities related to Open Science, inclusivity and diversity – especially with regards Southern and Eastern Europe and different career stages – including: development of novel SSH-STEM collaborations to facilitate the delivery of the EU Green Deal; SSH knowledge brokerage to support regions in transition; and the effective design of strategies for citizen engagement in EU R&I activities. Outputs include action-led agendas and building stakeholder synergies through regular Policy Insight events.This is captured in a high-profile virtual SSH CENTRE generating and sharing best practice for SSH policy advice, overcoming fragmentation to accelerate the EU’s journey to a sustainable future.The documents uploaded here are part of WP2 whereby novel, interdisciplinary teams were provided funding to undertake activities to develop a policy recommendation related to EU Green Deal policy. Each of these policy recommendations, and the activities that inform them, will be written-up as a chapter in an edited book collection. Three books will make up this edited collection - one on climate, one on energy and one on mobility. As part of writing a chapter for the SSH CENTRE book on ‘Mobility’, we set out to analyse the sentiment of users on Twitter regarding shared and active mobility modes in Brussels. This involved us collecting tweets between 2017-2022. A tweet was collected if it contained a previously defined mobility keyword (for example: metro) and either the name of a (local) politician, a neighbourhood or municipality, or a (shared) mobility provider. The files attached to this Zenodo webpage is a csv files containing the tweets collected.”.

  9. c

    Sentiment Analysis Dataset

    • cubig.ai
    zip
    Updated May 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CUBIG (2025). Sentiment Analysis Dataset [Dataset]. https://cubig.ai/store/products/270/sentiment-analysis-dataset
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction • The Sentiment Analysis Dataset is a dataset for emotional analysis, including large-scale tweet text collected from Twitter and emotional polarity (0=negative, 2=neutral, 4=positive) labels for each tweet, featuring automatic labeling based on emoticons.

    2) Data Utilization (1) Sentiment Analysis Dataset has characteristics that: • Each sample consists of six columns: emotional polarity, tweet ID, date of writing, search word, author, and tweet body, and is suitable for training natural language processing and classification models using tweet text and emotion labels. (2) Sentiment Analysis Dataset can be used to: • Emotional Classification Model Development: Using tweet text and emotional polarity labels, we can build positive, negative, and neutral emotional automatic classification models with various machine learning and deep learning models such as logistic regression, SVM, RNN, and LSTM. • Analysis of SNS public opinion and trends: By analyzing the distribution of emotions by time series and keywords, you can explore changes in public opinion on specific issues or brands, positive and negative trends, and key emotional keywords.

  10. Twitter Sentiment Analysis Dataset

    • kaggle.com
    zip
    Updated Aug 16, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tùng Lê Thanh (2023). Twitter Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/tungle98/twitter-sentiment-dataset
    Explore at:
    zip(1291530 bytes)Available download formats
    Dataset updated
    Aug 16, 2023
    Authors
    Tùng Lê Thanh
    Description

    Dataset

    This dataset was created by Tùng Lê Thanh

    Contents

  11. Datasets for Sentiment Analysis

    • zenodo.org
    csv
    Updated Dec 10, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Julie R. Repository creator - Campos Arias; Julie R. Repository creator - Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
    Explore at:
    csvAvailable download formats
    Dataset updated
    Dec 10, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Julie R. Repository creator - Campos Arias; Julie R. Repository creator - Campos Arias
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.

    Below are the datasets specified, along with the details of their references, authors, and download sources.

    ----------- STS-Gold Dataset ----------------

    The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.

    Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.

    File name: sts_gold_tweet.csv

    ----------- Amazon Sales Dataset ----------------

    This dataset is having the data of 1K+ Amazon Product's Ratings and Reviews as per their details listed on the official website of Amazon. The data was scraped in the month of January 2023 from the Official Website of Amazon.

    Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)

    Features:

    • product_id - Product ID
    • product_name - Name of the Product
    • category - Category of the Product
    • discounted_price - Discounted Price of the Product
    • actual_price - Actual Price of the Product
    • discount_percentage - Percentage of Discount for the Product
    • rating - Rating of the Product
    • rating_count - Number of people who voted for the Amazon rating
    • about_product - Description about the Product
    • user_id - ID of the user who wrote review for the Product
    • user_name - Name of the user who wrote review for the Product
    • review_id - ID of the user review
    • review_title - Short review
    • review_content - Long review
    • img_link - Image Link of the Product
    • product_link - Official Website Link of the Product

    License: CC BY-NC-SA 4.0

    File name: amazon.csv

    ----------- Rotten Tomatoes Reviews Dataset ----------------

    This rating inference dataset is a sentiment classification dataset, containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5331 rows contains only negative samples and the last 5331 rows contain only positive samples, thus the data should be shuffled before usage.

    This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).

    Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics

    File name: data_rt.csv

    ----------- Preprocessed Dataset Sentiment Analysis ----------------

    Preprocessed amazon product review data of Gen3EcoDot (Alexa) scrapped entirely from amazon.in
    Stemmed and lemmatized using nltk.
    Sentiment labels are generated using TextBlob polarity scores.

    The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).

    DOI: 10.34740/kaggle/dsv/3877817

    Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }

    This dataset was used in the experimental phase of my research.

    File name: EcoPreprocessed.csv

    ----------- Amazon Earphones Reviews ----------------

    This dataset consists of a 9930 Amazon reviews, star ratings, for 10 latest (as of mid-2019) bluetooth earphone devices for learning how to train Machine for sentiment analysis.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)

    License: U.S. Government Works

    Source: www.amazon.in

    File name (original): AllProductReviews.csv (contains 14337 reviews)

    File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)

    ----------- Amazon Musical Instruments Reviews ----------------

    This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review (raw) and division (manually added - categorical label generated using overall score).

    Source: http://jmcauley.ucsd.edu/data/amazon/

    File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)

    File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)

  12. r

    Twitter Sentiment Analysis Dataset

    • resodate.org
    • service.tib.eu
    Updated Nov 25, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sara Rosenthal; Noura Farra; Preslav Nakov (2024). Twitter Sentiment Analysis Dataset [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvdHdpdHRlci1zZW50aW1lbnQtYW5hbHlzaXMtZGF0YXNldA==
    Explore at:
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    Leibniz Data Manager
    Authors
    Sara Rosenthal; Noura Farra; Preslav Nakov
    Description

    The dataset comprises tweets labeled with sentiment ratings in an ordinal five-point scale, including classes for strongly negative, negative, neutral, positive, and strongly positive.

  13. h

    tweet_sentiment_multilingual

    • huggingface.co
    • opendatalab.com
    Updated Dec 25, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cardiff NLP (2022). tweet_sentiment_multilingual [Dataset]. https://huggingface.co/datasets/cardiffnlp/tweet_sentiment_multilingual
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 25, 2022
    Dataset authored and provided by
    Cardiff NLP
    Description

    Dataset Card for cardiffnlp/tweet_sentiment_multilingual

      Dataset Summary
    

    Tweet Sentiment Multilingual consists of sentiment analysis dataset on Twitter in 8 different lagnuages.

    arabic english french german hindi italian portuguese spanish

      Supported Tasks and Leaderboards
    

    text_classification: The dataset can be trained using a SentenceClassification model from HuggingFace transformers.

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    An instance from… See the full description on the dataset page: https://huggingface.co/datasets/cardiffnlp/tweet_sentiment_multilingual.

  14. h

    twitter-sentiment-analysis

    • huggingface.co
    Updated Oct 30, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Md. Abdullah Al Mamun (2025). twitter-sentiment-analysis [Dataset]. https://huggingface.co/datasets/bdstar/twitter-sentiment-analysis
    Explore at:
    Dataset updated
    Oct 30, 2025
    Authors
    Md. Abdullah Al Mamun
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    🐦 Twitter Sentiment Analysis (bdstar/twitter-sentiment-analysis)

      🧠 Overview
    

    A refined and merged version of Twitter text sentiment datasets, providing a clean and well-balanced dataset for sentiment classification across three sentiment categories:positive, negative, and neutral. This dataset is split into three parts — train, test, and validation — each sourced from highly reputable open datasets.It is designed for training, evaluating, and benchmarking NLP models for… See the full description on the dataset page: https://huggingface.co/datasets/bdstar/twitter-sentiment-analysis.

  15. c

    Twitter Tweets Sentiment Dataset

    • cubig.ai
    zip
    Updated Feb 25, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CUBIG (2025). Twitter Tweets Sentiment Dataset [Dataset]. https://cubig.ai/store/products/142/twitter-tweets-sentiment-dataset
    Explore at:
    zipAvailable download formats
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data introduction • Twitter-tweets-sentiment dataset is a dataset that aims to analyze tweet sentiment for Twitter and natural language processing.

    2) Data utilization (1)Twitter-tweets-sentiment data has characteristics that: • The data consists of three columns, including emotion and text, and aims to block negative tweets through a powerful classification model. (2) Twitter-tweets-sentiment data can be used to: • Social Media Monitoring: Businesses and organizations can use data to monitor social media platforms and gauge public sentiment about a brand, product, event, or social issue. • Sentiment analysis: This dataset can be used to train models that classify the sentiment of tweets, which can help companies and researchers understand public opinion on a variety of topics.

  16. h

    twitter-financial-news-sentiment

    • huggingface.co
    • opendatalab.com
    • +1more
    Updated Dec 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    not a (2022). twitter-financial-news-sentiment [Dataset]. https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 4, 2022
    Authors
    not a
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description

    The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.

    The dataset holds 11,932 documents annotated with 3 labels:

    sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }

    The data was collected using the Twitter API. The current dataset supports the multi-class classification… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment.

  17. h

    AfriSenti-Twitter

    • huggingface.co
    • opendatalab.com
    Updated Feb 19, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    HausaNLP (2023). AfriSenti-Twitter [Dataset]. https://huggingface.co/datasets/HausaNLP/AfriSenti-Twitter
    Explore at:
    Dataset updated
    Feb 19, 2023
    Dataset authored and provided by
    HausaNLP
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    AfriSenti is the largest sentiment analysis benchmark dataset for under-represented African languages---covering 110,000+ annotated tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and yoruba).

  18. a

    Sentiment-140 Dataset

    • datasets.activeloop.ai
    • tensorflow.org
    • +2more
    deeplake
    Updated Feb 3, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alec Go, Richa Bhayani, and Lei Huang (2022). Sentiment-140 Dataset [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/sentiment-140-dataset/
    Explore at:
    deeplakeAvailable download formats
    Dataset updated
    Feb 3, 2022
    Dataset authored and provided by
    Alec Go, Richa Bhayani, and Lei Huang
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2009 - Dec 31, 2009
    Description

    A dataset of tweets with sentiment labels, collected from Twitter in 2009. The dataset contains 1,800,000 tweets, with each tweet having a sentiment label of positive, negative, or neutral. The dataset is a valuable resource for training and evaluating sentiment analysis models.

  19. Z

    Data from: IA Tweets Analysis Dataset (Spanish)

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Aug 3, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés (2024). IA Tweets Analysis Dataset (Spanish) [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_10821484
    Explore at:
    Dataset updated
    Aug 3, 2024
    Dataset provided by
    University of Cadiz
    Authors
    Guerrero-Contreras, Gabriel; Balderas-Díaz, Sara; Serrano-Fernández, Alejandro; Muñoz, Andrés
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General Description

    This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

    Data Collection Method

    Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

    Dataset Content

    ID: A unique identifier for each tweet.

    text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

    polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

    favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

    retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

    user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

    user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

    user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

    user_followers_count: The current number of followers the account has. It is a non-negative integer.

    user_friends_count: The number of users that the account is following. It is a non-negative integer.

    user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

    user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

    user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

    user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.

    Cite as

    Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

    Potential Use Cases

    This dataset is aimed at academic researchers and practitioners with interests in:

    Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

    Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

    Exploring correlations between user engagement metrics and sentiment in discussions about AI.

    Data Format and File Type

    The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

    License

    The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.

  20. m

    The Climate Change Twitter Dataset

    • data.mendeley.com
    • kaggle.com
    Updated May 19, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dimitrios Effrosynidis (2022). The Climate Change Twitter Dataset [Dataset]. http://doi.org/10.17632/mw8yd7z9wc.2
    Explore at:
    Dataset updated
    May 19, 2022
    Authors
    Dimitrios Effrosynidis
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    If you use the dataset, cite the paper: https://doi.org/10.1016/j.eswa.2022.117541

    The most comprehensive dataset to date regarding climate change and human opinions via Twitter. It has the heftiest temporal coverage, spanning over 13 years, includes over 15 million tweets spatially distributed across the world, and provides the geolocation of most tweets. Seven dimensions of information are tied to each tweet, namely geolocation, user gender, climate change stance and sentiment, aggressiveness, deviations from historic temperature, and topic modeling, while accompanied by environmental disaster events information. These dimensions were produced by testing and evaluating a plethora of state-of-the-art machine learning algorithms and methods, both supervised and unsupervised, including BERT, RNN, LSTM, CNN, SVM, Naive Bayes, VADER, Textblob, Flair, and LDA.

    The following columns are in the dataset:

    ➡ created_at: The timestamp of the tweet. ➡ id: The unique id of the tweet. ➡ lng: The longitude the tweet was written. ➡ lat: The latitude the tweet was written. ➡ topic: Categorization of the tweet in one of ten topics namely, seriousness of gas emissions, importance of human intervention, global stance, significance of pollution awareness events, weather extremes, impact of resource overconsumption, Donald Trump versus science, ideological positions on global warming, politics, and undefined. ➡ sentiment: A score on a continuous scale. This scale ranges from -1 to 1 with values closer to 1 being translated to positive sentiment, values closer to -1 representing a negative sentiment while values close to 0 depicting no sentiment or being neutral. ➡ stance: That is if the tweet supports the belief of man-made climate change (believer), if the tweet does not believe in man-made climate change (denier), and if the tweet neither supports nor refuses the belief of man-made climate change (neutral). ➡ gender: Whether the user that made the tweet is male, female, or undefined. ➡ temperature_avg: The temperature deviation in Celsius and relative to the January 1951-December 1980 average at the time and place the tweet was written. ➡ aggressiveness: That is if the tweet contains aggressive language or not.

    Since Twitter forbids making public the text of the tweets, in order to retrieve it you need to do a process called hydrating. Tools such as Twarc or Hydrator can be used to hydrate tweets.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1

Twitter Sentiments Dataset

Explore at:
Dataset updated
May 14, 2021
Authors
SHERIF HUSSEIN
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The dataset has three sentiments namely, negative, neutral, and positive. It contains two fields for the tweet and label.

Search
Clear search
Close search
Google apps
Main menu