100+ datasets found
  1. Twitter Tweets Sentiment Dataset

    • kaggle.com
    zip
    Updated Apr 8, 2022
    Cite
    M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
    Explore at:
    Available download formats: zip (1289519 bytes)
    Dataset updated
    Apr 8, 2022
    Authors
    M Yasser H
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description:

    Twitter is an online social media platform where people share their thoughts as tweets. Some people misuse it to post hateful content. Twitter is trying to tackle this problem, and we can help by building a strong NLP-based classifier that identifies negative tweets so they can be blocked. Can you build a strong classifier model to predict the same?

    Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

    Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.

    You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

    Columns:

    1. textID - unique ID for each piece of text
    2. text - the text of the tweet
    3. sentiment - the general sentiment of the tweet
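The parsing note above can be sketched with Python's standard `csv` module, which removes the field-delimiting quotes on its own. This is a minimal sketch, not the dataset author's loader: the sample rows are made up for illustration, and only the column names `textID`, `text`, and `sentiment` come from the description.

```python
import csv
import io

# Made-up sample rows for illustration; not taken from the actual dataset.
SAMPLE_CSV = '''textID,text,sentiment
a1,"  I love this, truly!  ",positive
b2,"""quoted"" words here",neutral
c3,so tired today,negative
'''

def load_rows(handle):
    """Parse the CSV and return a list of dicts with cleaned text."""
    rows = []
    for row in csv.DictReader(handle):
        # csv already removes the quotes that delimit a field; strip() trims
        # leftover whitespace so it does not leak into the training text.
        row["text"] = row["text"].strip()
        rows.append(row)
    return rows

rows = load_rows(io.StringIO(SAMPLE_CSV))
for r in rows:
    print(r["textID"], r["sentiment"], repr(r["text"]))
```

Quotes that are part of the tweet itself (written as doubled quotes inside a quoted field) are preserved, while the delimiting quotes and the embedded comma are handled by the parser.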

    Acknowledgement:

    The dataset is downloaded from Kaggle Competitions:
    https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

    Objective:

    • Understand the dataset and clean it up (if required).
    • Build classification models to predict the Twitter sentiments.
    • Compare the evaluation metrics of various classification algorithms.
  2. Twitter Sentiments Dataset

    • data.mendeley.com
    Updated May 14, 2021
    + more versions
    Cite
    SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1
    Explore at:
    Dataset updated
    May 14, 2021
    Authors
    SHERIF HUSSEIN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset has three sentiment classes: negative, neutral, and positive. It contains two fields, one for the tweet and one for the label.

  3. twitter-sentiment-analysis

    • huggingface.co
    Updated Aug 16, 2022
    Cite
    Miguel Carlos Blanco Cacharrón (2022). twitter-sentiment-analysis [Dataset]. https://huggingface.co/datasets/carblacac/twitter-sentiment-analysis
    Explore at:
    Dataset updated
    Aug 16, 2022
    Authors
    Miguel Carlos Blanco Cacharrón
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Twitter Sentiment Analysis Dataset contains 1,578,627 classified tweets; each row is marked 1 for positive sentiment or 0 for negative sentiment. The dataset is based on data from the following two sources:

    • University of Michigan Sentiment Analysis competition on Kaggle
    • Twitter Sentiment Corpus by Niek Sanders

    Finally, I randomly selected a subset of them, applied a cleaning process, and divided them between the test and train subsets, keeping a balance between the number of positive and negative tweets within each of these subsets.
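A class-balanced train/test division like the one described can be sketched as a stratified split. This is an illustration under assumptions, not the author's actual procedure: the function name `stratified_split`, the 20% test fraction, and the placeholder tweets are all hypothetical.

```python
import random
from collections import Counter

def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (text, label) pairs so each label keeps the same share in both parts."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))
    train, test = [], []
    for items in by_label.values():
        rng.shuffle(items)
        cut = int(len(items) * test_fraction)
        test.extend(items[:cut])
        train.extend(items[cut:])
    return train, test

# Placeholder data: 1 = positive, 0 = negative, as in the description.
rows = [(f"tweet {i}", i % 2) for i in range(100)]
train, test = stratified_split(rows)
print(Counter(label for _, label in train))
print(Counter(label for _, label in test))
```

Splitting each label group separately is what keeps the positive/negative ratio identical in both subsets; a single shuffle over all rows would only balance them approximately.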

  4. Indonesian Twitter Sentiment Analysis Dataset-PPKM

    • kaggle.com
    zip
    Updated Jul 31, 2023
    Cite
    Angga Widiarta (2023). Indonesian Twitter Sentiment Analysis Dataset-PPKM [Dataset]. https://www.kaggle.com/datasets/anggapurnama/twitter-dataset-ppkm
    Explore at:
    Available download formats: zip (3757803 bytes)
    Dataset updated
    Jul 31, 2023
    Authors
    Angga Widiarta
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains a collection of tweets from the Indonesian community, expressing their opinions on the government's implementation of PPKM (Enforcement of Community Activity Restrictions). The dataset consists of approximately 20,000 tweets gathered within the time range from April 1, 2020, to April 1, 2022.

    The selected time range for data collection is based on when Indonesia started implementing PPKM extensively and when the government revoked the policy. Within this dataset, diverse opinions, comments, and reactions from the public regarding the PPKM policy during that period can be found.

    This dataset provides an opportunity to analyze the sentiment and public views regarding the PPKM policy, as well as observe changes in opinions over time. It offers valuable insights into understanding the perceptions and reactions of the community towards government policies related to PPKM.

    Label: 0 (Positive), 1 (Neutral), 2 (Negative)
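Because this encoding assigns 0 to Positive and 2 to Negative, which differs from the 0 = negative convention used by some other datasets in this list, it may help to decode labels explicitly before combining sources. A minimal sketch, in which the names `PPKM_LABELS` and `decode_label` are hypothetical:

```python
# Mapping as stated in the dataset description; note that 0 is Positive here.
PPKM_LABELS = {0: "Positive", 1: "Neutral", 2: "Negative"}

def decode_label(value):
    """Turn a raw label (int or numeric string) into its sentiment name."""
    return PPKM_LABELS[int(value)]

print([decode_label(v) for v in ["0", 1, "2"]])
```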

  5. twitter-sentiment-analysis

    • huggingface.co
    Updated Jan 24, 2026
    + more versions
    Cite
    Md. Abdullah Al Mamun (2026). twitter-sentiment-analysis [Dataset]. https://huggingface.co/datasets/bdstar/twitter-sentiment-analysis
    Explore at:
    Dataset updated
    Jan 24, 2026
    Authors
    Md. Abdullah Al Mamun
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    🐦 Twitter Sentiment Analysis (bdstar/twitter-sentiment-analysis)

    🧠 Overview

    A refined and merged version of Twitter text sentiment datasets, providing a clean and well-balanced dataset for sentiment classification across three sentiment categories: positive, negative, and neutral. This dataset is split into three parts (train, test, and validation), each sourced from highly reputable open datasets. It is designed for training, evaluating, and benchmarking NLP models for… See the full description on the dataset page: https://huggingface.co/datasets/bdstar/twitter-sentiment-analysis.

  6. Twitter Sentiment Analysis Dataset

    • kaggle.com
    zip
    Updated Jul 3, 2023
    Cite
    Durgesh Rao (2023). Twitter Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/durgeshrao9993/twitter-analysis-dataset-2022
    Explore at:
    Available download formats: zip (1291530 bytes)
    Dataset updated
    Jul 3, 2023
    Authors
    Durgesh Rao
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The Twitter Sentiment Analysis Dataset is a widely used dataset in the field of natural language processing and sentiment analysis. It consists of a collection of tweets, each labeled with the sentiment expressed in the tweet, which can be positive, negative, or neutral. This dataset is commonly used for training and evaluating machine learning models that aim to automatically analyze and classify the sentiment behind Twitter messages.

    The dataset contains a diverse range of tweets, capturing the opinions, emotions, and attitudes of Twitter users on various topics such as movies, products, events, or general daily experiences. The tweets cover a broad spectrum of sentiments, including expressions of joy, satisfaction, anger, disappointment, sarcasm, or indifference.

  7. Brand Sentiment Analysis Dataset (Twitter)

    • kaggle.com
    zip
    Updated Jan 7, 2024
    Cite
    Tushar Paul (2024). Brand Sentiment Analysis Dataset (Twitter) [Dataset]. https://www.kaggle.com/datasets/tusharpaul2001/brand-sentiment-analysis-dataset
    Explore at:
    Available download formats: zip (375745 bytes)
    Dataset updated
    Jan 7, 2024
    Authors
    Tushar Paul
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset description

    Users assessed tweets related to various brands and products, providing evaluations on whether the sentiment conveyed was positive, negative, or neutral. Additionally, if the tweet conveyed any sentiment, contributors identified the specific brand or product targeted by that emotion.

    Train Dataset: 8589 rows x 3 columns

    Test Dataset: 504 rows x 1 column

  8. Twitter Sentiment Analysis Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jun 4, 2026
    Cite
    Bright Data (2026). Twitter Sentiment Analysis Datasets [Dataset]. https://brightdata.com/products/datasets/twitter/sentiment-analysis
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Jun 4, 2026
    Dataset authored and provided by
    Bright Data: https://brightdata.com/
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Our Twitter Sentiment Analysis Dataset provides a comprehensive collection of tweets, enabling businesses, researchers, and analysts to assess public sentiment, track trends, and monitor brand perception in real time. This dataset includes detailed metadata for each tweet, allowing for in-depth analysis of user engagement, sentiment trends, and social media impact.

    Key Features:

    • Tweet Content & Metadata: Includes tweet text, hashtags, mentions, media attachments, and engagement metrics such as likes, retweets, and replies.
    • Sentiment Classification: Analyze sentiment polarity (positive, negative, neutral) to gauge public opinion on brands, events, and trending topics.
    • Author & User Insights: Access user details such as username, profile information, follower count, and account verification status.
    • Hashtag & Topic Tracking: Identify trending hashtags and keywords to monitor conversations and sentiment shifts over time.
    • Engagement Metrics: Measure tweet performance based on likes, shares, and comments to evaluate audience interaction.
    • Historical & Real-Time Data: Choose from historical datasets for trend analysis or real-time data for up-to-date sentiment tracking.

    Use Cases:

    • Brand Monitoring & Reputation Management: Track public sentiment around brands, products, and services to manage reputation and customer perception.
    • Market Research & Consumer Insights: Analyze consumer opinions on industry trends, competitor performance, and emerging market opportunities.
    • Political & Social Sentiment Analysis: Evaluate public opinion on political events, social movements, and global issues.
    • AI & Machine Learning Applications: Train sentiment analysis models for natural language processing (NLP) and predictive analytics.
    • Advertising & Campaign Performance: Measure the effectiveness of marketing campaigns by analyzing audience engagement and sentiment.

    Our dataset is available in multiple formats (JSON, CSV, Excel) and can be delivered via API, cloud storage (AWS, Google Cloud, Azure), or direct download. Gain valuable insights into social media sentiment and enhance your decision-making with high-quality, structured Twitter data.
    
  9. twitter-financial-news-sentiment

    • huggingface.co
    • opendatalab.com
    Updated Dec 4, 2022
    + more versions
    Cite
    not a (2022). twitter-financial-news-sentiment [Dataset]. https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Dec 4, 2022
    Authors
    not a
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description

    The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.

    The dataset holds 11,932 documents annotated with 3 labels:

    sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }

    The data was collected using the Twitter API. The current dataset supports the multi-class classification… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment.
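The label table above can be turned into a small lookup helper. This is a sketch under one stated assumption: that an integer class id `n` corresponds to the key `LABEL_n`, which the description implies but does not state. The function name `sentiment_name` is hypothetical.

```python
# Label names as listed in the dataset description.
SENTIMENTS = {"LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral"}

def sentiment_name(label):
    """Accept an integer class id (0, 1, 2) or a 'LABEL_n' string."""
    # Assumption: integer id n maps to the key "LABEL_n".
    key = label if isinstance(label, str) else f"LABEL_{int(label)}"
    return SENTIMENTS[key]

print(sentiment_name(1), sentiment_name("LABEL_0"))
```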

  10. Sentiment Dataset with 1 Million Tweets

    • kaggle.com
    zip
    Updated Oct 11, 2022
    Cite
    Muhammad Tariq (2022). Sentiment Dataset with 1 Million Tweets [Dataset]. https://www.kaggle.com/datasets/tariqsays/sentiment-dataset-with-1-million-tweets
    Explore at:
    Available download formats: zip (79055816 bytes)
    Dataset updated
    Oct 11, 2022
    Authors
    Muhammad Tariq
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This is the Sentiment dataset. The tweets have been annotated with 4 categories (positive, negative, uncertainty, litigious) and can be used to detect sentiment.

    Content

    It contains the following 3 fields:

    • Language
    • Text
    • Label

  11. Twitter for Sentiment Analysis

    • t4sa.it
    Updated Oct 23, 2017
    Cite
    (2017). Twitter for Sentiment Analysis [Dataset]. http://www.t4sa.it/
    Explore at:
    Dataset updated
    Oct 23, 2017
    Time period covered
    Jul 2016 - Dec 2016
    Description

    3 million tweets containing both text and images

  12. twitter-airline-sentiment

    • huggingface.co
    Updated Feb 24, 2015
    + more versions
    Cite
    Omar Sanseviero (2015). twitter-airline-sentiment [Dataset]. https://huggingface.co/datasets/osanseviero/twitter-airline-sentiment
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Feb 24, 2015
    Authors
    Omar Sanseviero
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for Twitter US Airline Sentiment

    Dataset Summary

    This data originally came from Crowdflower's Data for Everyone library. As the original source says,

    A sentiment analysis job about the problems of each major U.S. airline. Twitter data was scraped from February of 2015 and contributors were asked to first classify positive, negative, and neutral tweets, followed by categorizing negative reasons (such as "late flight" or "rude service").

    The data we're… See the full description on the dataset page: https://huggingface.co/datasets/osanseviero/twitter-airline-sentiment.

  13. twitter sentiment analysis

    • kaggle.com
    zip
    Updated Sep 27, 2024
    Cite
    Omar Daniel Abou Assaf Hdiefah (2024). twitter sentiment analysis [Dataset]. https://www.kaggle.com/datasets/daniel09817/twitter-sentiment-analysis
    Explore at:
    Available download formats: zip (42857299 bytes)
    Dataset updated
    Sep 27, 2024
    Authors
    Omar Daniel Abou Assaf Hdiefah
    Description

    This dataset contains over 690,000 tweets labeled as Positive, Negative, or Neutral. The data can be used for sentiment analysis and natural language processing tasks. The tweets span various topics, making this a versatile dataset for training and evaluating machine learning models. The dataset was collected and labeled through. It offers a balanced distribution of sentiments to enable robust analysis

    Sentiment Distribution: Positive: 248,516 (35.9%) Negative: 244,146 (35.3%) Neutral: 198,586 (28.7%)
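The listed distribution can be checked with a few lines of arithmetic. The counts below are copied from the description; the rest is a plain recomputation.

```python
# Counts as listed in the dataset description.
counts = {"Positive": 248_516, "Negative": 244_146, "Neutral": 198_586}

total = sum(counts.values())
shares = {name: 100 * n / total for name, n in counts.items()}

print(total)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The counts sum to 691,248, consistent with "over 690,000 tweets". The Positive share works out to roughly 35.95%, so the listed 35.9% appears to be truncated rather than rounded, which would also explain why the three listed percentages add up to 99.9%.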

  14. twitter-sentiment-dataset-en

    • huggingface.co
    Updated Aug 1, 2023
    Cite
    Yogi Yulianto (2023). twitter-sentiment-dataset-en [Dataset]. https://huggingface.co/datasets/yogiyulianto/twitter-sentiment-dataset-en
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Aug 1, 2023
    Authors
    Yogi Yulianto
    License

    https://choosealicense.com/licenses/other/

    Description

    yogiyulianto/twitter-sentiment-dataset-en dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Sentiment Prediction Outputs for Twitter Dataset

    • test.researchdata.tuwien.ac.at
    bin, csv, png, txt
    Updated May 20, 2025
    Cite
    Hachem Bouhamidi (2025). Sentiment Prediction Outputs for Twitter Dataset [Dataset]. http://doi.org/10.70124/c8v83-0sy11
    Explore at:
    Available download formats: bin, png, csv, txt
    Dataset updated
    May 20, 2025
    Dataset provided by
    TU Wien
    Authors
    Hachem Bouhamidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 28, 2025
    Description

    Context and Methodology:

    This dataset was created as part of a sentiment analysis project using enriched Twitter data. The objective was to train and test a machine learning model to automatically classify the sentiment of tweets (e.g., Positive, Negative, Neutral).
    The data was generated using tweets that were sentiment-scored with a custom sentiment scorer. A machine learning pipeline was applied, including text preprocessing, feature extraction with CountVectorizer, and prediction with a HistGradientBoostingClassifier.

    Technical Details:

    The dataset includes five main files:

    • test_predictions_full.csv – Predicted sentiment labels for the test set.

    • sentiment_model.joblib – Trained machine learning model.

    • count_vectorizer.joblib – Text feature extraction model (CountVectorizer).

    • model_performance.txt – Evaluation metrics and performance report of the trained model.

    • confusion_matrix.png – Visualization of the model’s confusion matrix.

    The files follow standard naming conventions based on their purpose.
    The .joblib files can be loaded into Python using the joblib and scikit-learn libraries.
    The .csv, .txt, and .png files can be opened with any standard text reader, spreadsheet software, or image viewer.
    Additional performance documentation is included within the model_performance.txt file.

    Additional Details:

    • The data was constructed to ensure reproducibility.

    • No personal or sensitive information is present.

    • It can be reused by researchers, data scientists, and students interested in Natural Language Processing (NLP), machine learning classification, and sentiment analysis tasks.

  16. Dataset for twitter Sentiment Analysis using Roberta and Vader

    • data.mendeley.com
    Updated May 14, 2023
    + more versions
    Cite
    Jannatul Ferdoshi (2023). Dataset for twitter Sentiment Analysis using Roberta and Vader [Dataset]. http://doi.org/10.17632/2sjt22sb55.1
    Explore at:
    Dataset updated
    May 14, 2023
    Authors
    Jannatul Ferdoshi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Our dataset comprises 1000 tweets, which were taken from Twitter using the Python programming language. The dataset was stored in a CSV file and generated using various modules. The random module was used to generate random IDs and text, while the faker module was used to generate random user names and dates. Additionally, the textblob module was used to assign a random sentiment to each tweet.

    This systematic approach ensures that the dataset is well-balanced and represents different types of tweets, user behavior, and sentiment. It is essential to have a balanced dataset to ensure that the analysis and visualization of the dataset are accurate and reliable. By generating tweets with a range of sentiments, we have created a diverse dataset that can be used to analyze and visualize sentiment trends and patterns.

    In addition to generating the tweets, we have also prepared a visual representation of the data sets. This visualization provides an overview of the key features of the dataset, such as the frequency distribution of the different sentiment categories, the distribution of tweets over time, and the user names associated with the tweets. This visualization will aid in the initial exploration of the dataset and enable us to identify any patterns or trends that may be present.

  17. Twitter-Conversations-Sentiment-Dataset

    • huggingface.co
    Updated Sep 22, 2025
    Cite
    DataHive AI (2025). Twitter-Conversations-Sentiment-Dataset [Dataset]. https://huggingface.co/datasets/datahiveai/Twitter-Conversations-Sentiment-Dataset
    Explore at:
    Dataset updated
    Sep 22, 2025
    Dataset authored and provided by
    DataHive AI
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Twitter Sentiment Dataset

    Sample English-only tweet sentiment dataset. Each row represents a single tweet with anonymized text and conversation structure. This is a sample dataset. To access the full version or request any custom dataset tailored to your needs, contact DataHive at contact@datahive.ai.

    Files Included

    dataset.csv – tweets data

    What’s included

    Anonymized tweet text Conversation linkage via root_id and parent_id 3-class sentiment label (positive… See the full description on the dataset page: https://huggingface.co/datasets/datahiveai/Twitter-Conversations-Sentiment-Dataset.
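The conversation linkage mentioned above could be used to rebuild reply threads. The following is a minimal sketch under assumptions: only `root_id` and `parent_id` are named in the description, so the `id`, `text`, and `sentiment` keys and all sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: only root_id and parent_id are named in the description;
# the "id", "text", and "sentiment" keys are assumptions for illustration.
rows = [
    {"id": "1", "root_id": "1", "parent_id": None, "text": "launch day", "sentiment": "positive"},
    {"id": "2", "root_id": "1", "parent_id": "1", "text": "great news", "sentiment": "positive"},
    {"id": "3", "root_id": "1", "parent_id": "2", "text": "not sure", "sentiment": "neutral"},
    {"id": "4", "root_id": "4", "parent_id": None, "text": "delayed again", "sentiment": "negative"},
]

def build_threads(rows):
    """Return the ids of root tweets and a parent -> replies mapping."""
    children = defaultdict(list)
    roots = []
    for row in rows:
        if row["parent_id"] is None:
            roots.append(row["id"])
        else:
            children[row["parent_id"]].append(row["id"])
    return roots, dict(children)

def depth_of(tweet_id, parents):
    """Count how many replies separate a tweet from its conversation root."""
    depth = 0
    while parents.get(tweet_id) is not None:
        tweet_id = parents[tweet_id]
        depth += 1
    return depth

roots, children = build_threads(rows)
parents = {r["id"]: r["parent_id"] for r in rows}
for r in rows:
    indent = "  " * depth_of(r["id"], parents)
    print(f'{indent}{r["id"]} [{r["sentiment"]}] {r["text"]}')
```

With the thread structure in hand, sentiment can be analyzed per conversation (for example, how replies differ from the root tweet) rather than per isolated tweet.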

  18. Sentiment Analysis on Financial Tweets

    • kaggle.com
    zip
    Updated Sep 5, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Vivek Rathi (2019). Sentiment Analysis on Financial Tweets [Dataset]. https://www.kaggle.com/datasets/vivekrathi055/sentiment-analysis-on-financial-tweets
    Explore at:
    Available download formats: zip (2538259 bytes)
    Dataset updated
    Sep 5, 2019
    Authors
    Vivek Rathi
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    The following information can also be found at https://www.kaggle.com/davidwallach/financial-tweets. Out of curiosity, I cleaned the .csv files to perform a sentiment analysis, so both .csv files in this dataset were created by me.

    Everything else in this description was written by David Wallach; using this information, I performed my first-ever sentiment analysis.

    "I have been interested in using public sentiment and journalism to gather sentiment profiles on publicly traded companies. I first developed a Python package (https://github.com/dwallach1/Stocker) that scrapes the web for articles written about companies, and then noticed the abundance of overlap with Twitter. I then developed a NodeJS project that I have been running on my RaspberryPi to monitor Twitter for all tweets coming from those mentioned in the content section. If one of them tweeted about a company in the stocks_cleaned.csv file, then it would write the tweet to the database. Currently, the file is only from earlier today, but after about a month or two, I plan to update the tweets.csv file (hopefully closer to 50,000 entries).

    I am not quite sure how this dataset will be relevant, but I hope to use these tweets and try to generate some sense of public sentiment score."

    Content

    This dataset has all the publicly traded companies (tickers and company names) that were used as input to fill the tweets.csv. The influencers whose tweets were monitored were: ['MarketWatch', 'business', 'YahooFinance', 'TechCrunch', 'WSJ', 'Forbes', 'FT', 'TheEconomist', 'nytimes', 'Reuters', 'GerberKawasaki', 'jimcramer', 'TheStreet', 'TheStalwart', 'TruthGundlach', 'Carl_C_Icahn', 'ReformedBroker', 'benbernanke', 'bespokeinvest', 'BespokeCrypto', 'stlouisfed', 'federalreserve', 'GoldmanSachs', 'ianbremmer', 'MorganStanley', 'AswathDamodaran', 'mcuban', 'muddywatersre', 'StockTwits', 'SeanaNSmith']

    Acknowledgements

    The data used here is gathered from a project I developed : https://github.com/dwallach1/StockerBot

    Inspiration

    I hope to develop a financial sentiment text classifier that would be able to track Twitter's (and the entire public's) feelings about any publicly traded company (and cryptocurrency).

  19. tweet_sentiment_multilingual

    • huggingface.co
    Updated Oct 13, 2022
    Cite
    Massive Text Embedding Benchmark (2022). tweet_sentiment_multilingual [Dataset]. https://huggingface.co/datasets/mteb/tweet_sentiment_multilingual
    Explore at:
    Dataset updated
    Oct 13, 2022
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    TweetSentimentClassification: an MTEB (Massive Text Embedding Benchmark) dataset

    A multilingual Sentiment Analysis dataset consisting of tweets in 8 different languages.

    Task category: t2c

    Domains: Social, Written

    Reference: https://aclanthology.org/2022.lrec-1.27

    How to evaluate on this task

    You can evaluate an embedding model on this dataset using the following code: import mteb

    task = mteb.get_tasks(["TweetSentimentClassification"]) evaluator =… See the full description on the dataset page: https://huggingface.co/datasets/mteb/tweet_sentiment_multilingual.

  20. Synthetic Twitter Sentiment Analysis

    • kaggle.com
    zip
    Updated Jan 15, 2024
    Cite
    smmmmmmmmmmmm (2024). Synthetic Twitter Sentiment Analysis [Dataset]. https://www.kaggle.com/datasets/smmmmmmmmmmmm/synthetic-twitter-sentiment-analysis
    Explore at:
    Available download formats: zip (225220 bytes)
    Dataset updated
    Jan 15, 2024
    Authors
    smmmmmmmmmmmm
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The dataset includes key attributes such as Tweet ID, Username, Tweet Text, Retweets, Favorites, Followers, Timestamp, and Sentiment. Each entry is generated using the Faker library, ensuring the diversity and randomness of the data while preserving the structural integrity of real-world Twitter data.

    The 'Sentiment' column represents the emotional tone associated with each tweet and is categorized into 'Positive', 'Negative', or 'Neutral' sentiments. This categorization is randomly assigned, emulating the dynamic nature of sentiment in social media content.
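Generation with randomly assigned labels, as described above, might look roughly like the sketch below. It is not the dataset author's script: the Faker library is swapped for Python's standard `random` module so the example stays self-contained, and the word list, field names, and `make_rows` helper are all hypothetical.

```python
import random

SENTIMENTS = ["Positive", "Negative", "Neutral"]
# Hypothetical vocabulary standing in for Faker-generated text.
WORDS = ["coffee", "monday", "launch", "traffic", "weekend", "update"]

def make_rows(n, seed=0):
    """Generate n synthetic tweet records with randomly assigned sentiment."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        rows.append({
            "tweet_id": i,
            "text": " ".join(rng.choices(WORDS, k=4)),
            "retweets": rng.randint(0, 100),
            # The label is drawn independently of the text, as the description states.
            "sentiment": rng.choice(SENTIMENTS),
        })
    return rows

rows = make_rows(5)
for row in rows:
    print(row)
```

Because the label is drawn independently of the text, there is no text-to-sentiment relationship for a model to learn; data generated this way is best suited to exercising a pipeline end to end rather than measuring classifier quality.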

    Researchers, data scientists, and machine learning practitioners can leverage this synthetic dataset to develop and test sentiment analysis models, explore feature engineering techniques, and evaluate the performance of algorithms in a controlled environment. The dataset serves as a valuable resource for honing natural language processing skills and gaining insights into sentiment trends in social media data. While synthetic, it mirrors the complexities and nuances found in real Twitter data, providing a foundation for robust sentiment analysis research and experimentation.


Twitter Tweets Sentiment Dataset

Twitter Tweets Sentiment Analysis for Natural Language Processing

Explore at:
43 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (1289519 bytes)
Dataset updated
Apr 8, 2022
Authors
M Yasser H
License

https://creativecommons.org/publicdomain/zero/1.0/
