100+ datasets found
  1. Twitter Sentiment Analysis Data

    • ieee-dataport.org
    Updated Aug 6, 2024
    Cite
    Rabindra Lamsal (2024). Twitter Sentiment Analysis Data [Dataset]. https://ieee-dataport.org/documents/twitter-sentiment-analysis-data
    Authors
    Rabindra Lamsal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    because of COVID-19

  2. Twitter Sentiments Dataset

    • data.mendeley.com
    Updated May 14, 2021
    Cite
    SHERIF HUSSEIN (2021). Twitter Sentiments Dataset [Dataset]. http://doi.org/10.17632/z9zw7nt5h2.1
    Authors
    SHERIF HUSSEIN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset has three sentiment classes: negative, neutral, and positive. It contains two fields: the tweet and its label.
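
A minimal sketch of how a two-field, three-class file like this could be checked before training. The column names `tweet` and `label`, the lowercase label spelling, and the sample rows are assumptions made for illustration; the dataset page does not state them.

```python
import csv
import io
from collections import Counter

# The three classes come from the dataset description; the column names
# "tweet" and "label" are assumptions made for this sketch.
ALLOWED_LABELS = {"negative", "neutral", "positive"}


def label_distribution(csv_text: str) -> Counter:
    """Count tweets per sentiment label, rejecting unknown labels."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["label"].strip().lower()
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        counts[label] += 1
    return counts


# Made-up rows, for illustration only.
sample = (
    "tweet,label\n"
    "Great day!,positive\n"
    "Meh.,neutral\n"
    "Awful service,negative\n"
    "Love it,positive\n"
)
distribution = label_distribution(sample)
print(distribution)
```

Checking the class balance first helps decide whether a plain accuracy metric is meaningful or whether per-class metrics are needed.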

  3. Sentiment Analysis on Financial Tweets

    • kaggle.com
    zip
    Updated Sep 5, 2019
    Cite
    Vivek Rathi (2019). Sentiment Analysis on Financial Tweets [Dataset]. https://www.kaggle.com/datasets/vivekrathi055/sentiment-analysis-on-financial-tweets
    Available download formats: zip (2538259 bytes)
    Authors
    Vivek Rathi
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    The following information can also be found at https://www.kaggle.com/davidwallach/financial-tweets. Out of curiosity, I cleaned the .csv files to perform a sentiment analysis, so both .csv files in this dataset were created by me.

    Everything in the description below was written by David Wallach; using this information, I performed my first-ever sentiment analysis.

    "I have been interested in using public sentiment and journalism to gather sentiment profiles on publicly traded companies. I first developed a Python package (https://github.com/dwallach1/Stocker) that scrapes the web for articles written about companies, and then noticed the abundance of overlap with Twitter. I then developed a NodeJS project that I have been running on my RaspberryPi to monitor Twitter for all tweets coming from those mentioned in the content section. If one of them tweeted about a company in the stocks_cleaned.csv file, then it would write the tweet to the database. Currently, the file is only from earlier today, but after about a month or two, I plan to update the tweets.csv file (hopefully closer to 50,000 entries).

    I am not quite sure how this dataset will be relevant, but I hope to use these tweets and try to generate some sense of public sentiment score."

    Content

    This dataset has all the publicly traded companies (tickers and company names) that were used as input to fill the tweets.csv. The influencers whose tweets were monitored were: ['MarketWatch', 'business', 'YahooFinance', 'TechCrunch', 'WSJ', 'Forbes', 'FT', 'TheEconomist', 'nytimes', 'Reuters', 'GerberKawasaki', 'jimcramer', 'TheStreet', 'TheStalwart', 'TruthGundlach', 'Carl_C_Icahn', 'ReformedBroker', 'benbernanke', 'bespokeinvest', 'BespokeCrypto', 'stlouisfed', 'federalreserve', 'GoldmanSachs', 'ianbremmer', 'MorganStanley', 'AswathDamodaran', 'mcuban', 'muddywatersre', 'StockTwits', 'SeanaNSmith']

    Acknowledgements

    The data used here is gathered from a project I developed: https://github.com/dwallach1/StockerBot

    Inspiration

    I hope to develop a financial sentiment text classifier that would be able to track Twitter's (and the entire public's) feelings about any publicly traded company (and cryptocurrency)

  4. Twitter Sentiment Analysis Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 24, 2024
    Cite
    Bright Data (2024). Twitter Sentiment Analysis Datasets [Dataset]. https://brightdata.com/products/datasets/twitter/sentiment-analysis
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Our Twitter Sentiment Analysis Dataset provides a comprehensive collection of tweets, enabling businesses, researchers, and analysts to assess public sentiment, track trends, and monitor brand perception in real time. This dataset includes detailed metadata for each tweet, allowing for in-depth analysis of user engagement, sentiment trends, and social media impact.

    Key Features:
    
      Tweet Content & Metadata: Includes tweet text, hashtags, mentions, media attachments, and engagement metrics such as likes, retweets, and replies.
      Sentiment Classification: Analyze sentiment polarity (positive, negative, neutral) to gauge public opinion on brands, events, and trending topics.
      Author & User Insights: Access user details such as username, profile information, follower count, and account verification status.
      Hashtag & Topic Tracking: Identify trending hashtags and keywords to monitor conversations and sentiment shifts over time.
      Engagement Metrics: Measure tweet performance based on likes, shares, and comments to evaluate audience interaction.
      Historical & Real-Time Data: Choose from historical datasets for trend analysis or real-time data for up-to-date sentiment tracking.
    
    
    Use Cases:
    
      Brand Monitoring & Reputation Management: Track public sentiment around brands, products, and services to manage reputation and customer perception.
      Market Research & Consumer Insights: Analyze consumer opinions on industry trends, competitor performance, and emerging market opportunities.
      Political & Social Sentiment Analysis: Evaluate public opinion on political events, social movements, and global issues.
      AI & Machine Learning Applications: Train sentiment analysis models for natural language processing (NLP) and predictive analytics.
      Advertising & Campaign Performance: Measure the effectiveness of marketing campaigns by analyzing audience engagement and sentiment.
    
    
    
      Our dataset is available in multiple formats (JSON, CSV, Excel) and can be delivered via API, cloud storage (AWS, Google Cloud, Azure), or direct download. 
      Gain valuable insights into social media sentiment and enhance your decision-making with high-quality, structured Twitter data.
    
  5. large-twitter-tweets-sentiment

    • huggingface.co
    Updated Mar 6, 2024
    Cite
    Gong Xiangbo (2024). large-twitter-tweets-sentiment [Dataset]. https://huggingface.co/datasets/gxb912/large-twitter-tweets-sentiment
    Authors
    Gong Xiangbo
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "Large twitter tweets sentiment analysis"

      Dataset Description

      Dataset Summary

    This dataset is a collection of tweets formatted in a tabular data structure, annotated for sentiment analysis. Each tweet is associated with a sentiment label, with 1 indicating a Positive sentiment and 0 for a Negative sentiment.

      Languages

    The tweets are in English.

      Dataset Structure

      Data Instances

    An instance of the dataset includes
 See the full description on the dataset page: https://huggingface.co/datasets/gxb912/large-twitter-tweets-sentiment.

  6. Brussel mobility Twitter sentiment analysis CSV Dataset

    • data.niaid.nih.gov
    Updated May 31, 2024
    Cite
    Tori, Floriano (2024). Brussel mobility Twitter sentiment analysis CSV Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11401123
    Dataset provided by
    Tori, Floriano
    Ginis, Vincent
    van Vessem, Charlotte
    Betancur Arenas, Juliana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Brussels
    Description

    SSH CENTRE (Social Sciences and Humanities for Climate, Energy aNd Transport Research Excellence) is a Horizon Europe project, engaging directly with stakeholders across research, policy, and business (including citizens) to strengthen social innovation, SSH-STEM collaboration, transdisciplinary policy advice, inclusive engagement, and SSH communities across Europe, accelerating the EU’s transition to carbon neutrality. SSH CENTRE is based in a range of activities related to Open Science, inclusivity and diversity (especially with regard to Southern and Eastern Europe and different career stages), including: development of novel SSH-STEM collaborations to facilitate the delivery of the EU Green Deal; SSH knowledge brokerage to support regions in transition; and the effective design of strategies for citizen engagement in EU R&I activities. Outputs include action-led agendas and building stakeholder synergies through regular Policy Insight events. This is captured in a high-profile virtual SSH CENTRE generating and sharing best practice for SSH policy advice, overcoming fragmentation to accelerate the EU’s journey to a sustainable future.

    The documents uploaded here are part of WP2, whereby novel, interdisciplinary teams were provided funding to undertake activities to develop a policy recommendation related to EU Green Deal policy. Each of these policy recommendations, and the activities that inform them, will be written up as a chapter in an edited book collection. Three books will make up this edited collection: one on climate, one on energy and one on mobility.

    As part of writing a chapter for the SSH CENTRE book on ‘Mobility’, we set out to analyse the sentiment of users on Twitter regarding shared and active mobility modes in Brussels. This involved collecting tweets posted between 2017 and 2022. A tweet was collected if it contained a previously defined mobility keyword (for example: metro) and either the name of a (local) politician, a neighbourhood or municipality, or a (shared) mobility provider. The material attached to this Zenodo webpage is in CSV format and contains the collected tweets.
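
The collection rule described above (a mobility keyword AND at least one context term) could be sketched as follows. The term lists are hypothetical: the description names only "metro" as an example keyword, and the authors' actual lists and matching logic are not given.

```python
# Hypothetical term lists; the source names only "metro" as an example keyword.
MOBILITY_KEYWORDS = {"metro", "tram", "bike"}
# Assumed stand-ins for politician, neighbourhood/municipality, or provider names.
CONTEXT_TERMS = {"ixelles", "schaerbeek", "stib"}


def should_collect(tweet_text: str) -> bool:
    """Return True if the tweet has a mobility keyword AND a context term."""
    # Lowercase each word and strip surrounding punctuation, '#' and '@'.
    words = {w.strip(".,!?#@").lower() for w in tweet_text.split()}
    has_keyword = bool(words & MOBILITY_KEYWORDS)
    has_context = bool(words & CONTEXT_TERMS)
    return has_keyword and has_context


print(should_collect("The #metro in Ixelles is late again"))
print(should_collect("The metro is late again"))
```

Whole-word matching is a design choice of this sketch; multi-word names (for example a politician's full name) would need phrase matching instead.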

  7. tweet_sentiment_multilingual

    • huggingface.co
    • opendatalab.com
    Updated Dec 25, 2022
    Cite
    Cardiff NLP (2022). tweet_sentiment_multilingual [Dataset]. https://huggingface.co/datasets/cardiffnlp/tweet_sentiment_multilingual
    Dataset authored and provided by
    Cardiff NLP
    Description

    Dataset Card for cardiffnlp/tweet_sentiment_multilingual

      Dataset Summary

    Tweet Sentiment Multilingual is a sentiment analysis dataset of tweets in 8 different languages:

    arabic, english, french, german, hindi, italian, portuguese, spanish

      Supported Tasks and Leaderboards

    text_classification: The dataset can be used to train a SentenceClassification model from HuggingFace transformers.

      Dataset Structure

      Data Instances

    An instance from
 See the full description on the dataset page: https://huggingface.co/datasets/cardiffnlp/tweet_sentiment_multilingual.

  8. financial-tweets-sentiment

    • huggingface.co
    Updated Dec 15, 2023
    Cite
    Tim Koornstra (2023). financial-tweets-sentiment [Dataset]. https://huggingface.co/datasets/TimKoornstra/financial-tweets-sentiment
    Authors
    Tim Koornstra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Financial Sentiment Analysis Dataset

      Overview
    

    This dataset is a comprehensive collection of tweets focused on financial topics, meticulously curated to assist in sentiment analysis in the domain of finance and stock markets. It serves as a valuable resource for training machine learning models to understand and predict sentiment trends based on social media discourse, particularly within the financial sector.

      Data Description
    

    The dataset comprises tweets
 See the full description on the dataset page: https://huggingface.co/datasets/TimKoornstra/financial-tweets-sentiment.

  9. Twitter Tweets Sentiment Dataset

    • cubig.ai
    Updated Feb 25, 2025
    Cite
    CUBIG (2025). Twitter Tweets Sentiment Dataset [Dataset]. https://cubig.ai/store/products/142/twitter-tweets-sentiment-dataset
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data introduction
    • Twitter-tweets-sentiment dataset is a dataset that aims to analyze tweet sentiment for Twitter and natural language processing.

    2) Data utilization
    (1) Twitter-tweets-sentiment data has characteristics that:
    • The data consists of three columns, including emotion and text, and aims to block negative tweets through a powerful classification model.
    (2) Twitter-tweets-sentiment data can be used to:
    • Social Media Monitoring: Businesses and organizations can use data to monitor social media platforms and gauge public sentiment about a brand, product, event, or social issue.
    • Sentiment analysis: This dataset can be used to train models that classify the sentiment of tweets, which can help companies and researchers understand public opinion on a variety of topics.

  10. The Climate Change Twitter Dataset

    • data.mendeley.com
    Updated May 19, 2022
    Cite
    Dimitrios Effrosynidis (2022). The Climate Change Twitter Dataset [Dataset]. http://doi.org/10.17632/mw8yd7z9wc.2
    Authors
    Dimitrios Effrosynidis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    If you use the dataset, cite the paper: https://doi.org/10.1016/j.eswa.2022.117541

    The most comprehensive dataset to date regarding climate change and human opinions via Twitter. It has the heftiest temporal coverage, spanning over 13 years, includes over 15 million tweets spatially distributed across the world, and provides the geolocation of most tweets. Seven dimensions of information are tied to each tweet, namely geolocation, user gender, climate change stance and sentiment, aggressiveness, deviations from historic temperature, and topic modeling, while accompanied by environmental disaster events information. These dimensions were produced by testing and evaluating a plethora of state-of-the-art machine learning algorithms and methods, both supervised and unsupervised, including BERT, RNN, LSTM, CNN, SVM, Naive Bayes, VADER, Textblob, Flair, and LDA.

    The following columns are in the dataset:

    ➡ created_at: The timestamp of the tweet.
    ➡ id: The unique id of the tweet.
    ➡ lng: The longitude the tweet was written.
    ➡ lat: The latitude the tweet was written.
    ➡ topic: Categorization of the tweet in one of ten topics namely, seriousness of gas emissions, importance of human intervention, global stance, significance of pollution awareness events, weather extremes, impact of resource overconsumption, Donald Trump versus science, ideological positions on global warming, politics, and undefined.
    ➡ sentiment: A score on a continuous scale. This scale ranges from -1 to 1 with values closer to 1 being translated to positive sentiment, values closer to -1 representing a negative sentiment while values close to 0 depicting no sentiment or being neutral.
    ➡ stance: That is if the tweet supports the belief of man-made climate change (believer), if the tweet does not believe in man-made climate change (denier), and if the tweet neither supports nor refuses the belief of man-made climate change (neutral).
    ➡ gender: Whether the user that made the tweet is male, female, or undefined.
    ➡ temperature_avg: The temperature deviation in Celsius and relative to the January 1951-December 1980 average at the time and place the tweet was written.
    ➡ aggressiveness: That is if the tweet contains aggressive language or not.

    Because Twitter forbids publishing the text of tweets, you need to retrieve it through a process called hydrating. Tools such as Twarc or Hydrator can be used to hydrate tweets.
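
The sentiment column is a continuous score from -1 to 1, so analyses that need discrete classes have to pick cut-offs. A minimal sketch follows; the width of the neutral band (0.1) is an assumption chosen for illustration and is not defined by the dataset.

```python
def sentiment_bucket(score: float, neutral_band: float = 0.1) -> str:
    """Map a score in [-1, 1] to 'negative', 'neutral', or 'positive'.

    neutral_band is a hypothetical threshold: scores whose absolute value
    is at most neutral_band are treated as neutral.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


for value in (0.8, -0.5, 0.05):
    print(value, sentiment_bucket(value))
```

The band width changes the class balance, so it is worth reporting alongside any results derived from the bucketed labels.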

  11. Covid Twitter Sentiment Analysis Datasets

    • kaggle.com
    zip
    Updated Jan 7, 2021
    Cite
    MEJBAH AHAMMAD (2021). Covid Twitter Sentiment Analysis Datasets [Dataset]. https://www.kaggle.com/mejbahahammad/covid-twitter-sentiment-analysis-datasets
    Available download formats: zip (111387463 bytes)
    Authors
    MEJBAH AHAMMAD
    Description

    This dataset gives a cursory glimpse at the overall sentiment trend of the public discourse regarding the COVID-19 pandemic on Twitter. The live scatter plot of this dataset is available as The Overall Trend block at https://live.rlamsal.com.np. The trend graph reveals multiple peaks and drops that need further analysis. The n-grams during those peaks and drops can prove beneficial for better understanding the discourse. The dataset will be updated weekly for as long as development of the Coronavirus (COVID-19) Tweets Dataset is ongoing.

  12. IA Tweets Analysis Dataset (Spanish)

    • data.niaid.nih.gov
    • produccioncientifica.uca.es
    Updated Aug 3, 2024
    Cite
    Muñoz, Andrés (2024). IA Tweets Analysis Dataset (Spanish) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10821484
    Dataset provided by
    Muñoz, Andrés
    Guerrero-Contreras, Gabriel
    Balderas-Díaz, Sara
    Serrano-Fernández, Alejandro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General Description

    This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

    Data Collection Method

    Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

    Dataset Content

    ID: A unique identifier for each tweet.

    text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

    polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

    favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

    retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

    user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

    user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

    user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

    user_followers_count: The current number of followers the account has. It is a non-negative integer.

    user_friends_count: The number of users that the account is following. It is a non-negative integer.

    user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

    user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

    user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

    user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.
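
A minimal sketch of reading a few of the columns listed above with the types the description gives them (non-negative integers, booleans, text of at most 280 characters). The column names come from the description; the `True`/`False` text serialization, the header spelling, and the sample row are assumptions for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class TweetRecord:
    tweet_id: str
    text: str
    polarity: str
    favorite_count: int
    retweet_count: int
    user_verified: bool


def parse_bool(value: str) -> bool:
    """Parse 'True'/'False' (case-insensitive); this serialization is assumed."""
    lowered = value.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"not a boolean: {value!r}")
    return lowered == "true"


def parse_count(value: str) -> int:
    """Parse a non-negative integer count."""
    count = int(value)
    if count < 0:
        raise ValueError(f"negative count: {count}")
    return count


def parse_row(row: dict) -> TweetRecord:
    """Convert one CSV row (as a dict of strings) into a typed record."""
    if len(row["text"]) > 280:
        raise ValueError("text exceeds 280 characters")
    return TweetRecord(
        tweet_id=row["ID"],
        text=row["text"],
        polarity=row["polarity"],
        favorite_count=parse_count(row["favorite_count"]),
        retweet_count=parse_count(row["retweet_count"]),
        user_verified=parse_bool(row["user_verified"]),
    )


# Made-up row, for illustration only.
sample = (
    "ID,text,polarity,favorite_count,retweet_count,user_verified\n"
    "1,La IA es fascinante,Positive,3,1,True\n"
)
records = [parse_row(r) for r in csv.DictReader(io.StringIO(sample))]
print(records[0])
```

The remaining user_* columns would follow the same pattern: `parse_count` for the counts and `parse_bool` for the flags.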

    Cite as

    Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

    Potential Use Cases

    This dataset is aimed at academic researchers and practitioners with interests in:

    Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

    Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

    Exploring correlations between user engagement metrics and sentiment in discussions about AI.

    Data Format and File Type

    The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

    License

    The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.

  13. Sentiment Prediction Outputs for Twitter Dataset

    • test.researchdata.tuwien.at
    bin, csv, png, txt
    Updated May 20, 2025
    Cite
    Hachem Bouhamidi (2025). Sentiment Prediction Outputs for Twitter Dataset [Dataset]. http://doi.org/10.70124/c8v83-0sy11
    Dataset provided by
    TU Wien
    Authors
    Hachem Bouhamidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 28, 2025
    Description

    Context and Methodology:

    This dataset was created as part of a sentiment analysis project using enriched Twitter data. The objective was to train and test a machine learning model to automatically classify the sentiment of tweets (e.g., Positive, Negative, Neutral).
    The data was generated using tweets that were sentiment-scored with a custom sentiment scorer. A machine learning pipeline was applied, including text preprocessing, feature extraction with CountVectorizer, and prediction with a HistGradientBoostingClassifier.

    Technical Details:

    The dataset includes five main files:

    • test_predictions_full.csv – Predicted sentiment labels for the test set.

    • sentiment_model.joblib – Trained machine learning model.

    • count_vectorizer.joblib – Text feature extraction model (CountVectorizer).

    • model_performance.txt – Evaluation metrics and performance report of the trained model.

    • confusion_matrix.png – Visualization of the model’s confusion matrix.

    The files follow standard naming conventions based on their purpose.
    The .joblib files can be loaded into Python using the joblib and scikit-learn libraries.
    The .csv, .txt, and .png files can be opened with any standard text reader, spreadsheet software, or image viewer.
    Additional performance documentation is included within the model_performance.txt file.

    Additional Details:

    • The data was constructed to ensure reproducibility.

    • No personal or sensitive information is present.

    • It can be reused by researchers, data scientists, and students interested in Natural Language Processing (NLP), machine learning classification, and sentiment analysis tasks.

  14. Sentiment Analysis of Enhanced Twitter Data with Custom Sentiment Scoring

    • test.dbrepo.tuwien.ac.at
    Updated Apr 29, 2025
    Cite
    Bouhamidi, Hachem (2025). Sentiment Analysis of Enhanced Twitter Data with Custom Sentiment Scoring [Dataset]. http://doi.org/10.82556/9rzx-7r26
    Authors
    Bouhamidi, Hachem
    Time period covered
    2025
    Description

    This database contains training, validation, and test sets created for a Twitter sentiment classification project. The tweets were cleaned and improved with custom-calculated sentiment scores and magnitudes using a word-weighted dictionary. The data is split to support machine learning experiments.

  15. Data mining and sentiment analysis on Twitter and Facebook

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 1, 2023
    Cite
    Halili, Admira (2023). Data mining and sentiment analysis on Twitter and Facebook [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7979377
    Dataset provided by
    Chevalley, Coralie
    Covaleda, Felipe
    Tavares Da Costa, Erisvelton
    Halili, Admira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The collected data concern the topic "Data mining and sentiment analysis on Twitter and Facebook". This dataset contains the following main attributes:

    titles, the title of the PDF file,

    authors, the authors of the PDF file,

    years, the year the PDF file was created,

    ncitedby, the number of citations,

    linkfiles, the links to the PDF file,

    as well as metadata.

    The dataset was collected from Google Scholar. Several Google Scholar searches were run for it (see the links below):

    https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=twitter+data+mining+filetype%3Apdf&btnG=

    https://scholar.google.com/scholar?start=490&q=facebook+data+mining+-Twitter+filetype:pdf&hl=en&as_sdt=0,5

    https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=seniment+analyse+twitter+filetype%3Apdf&btnG=

  16. Data from: Arabic news credibility on Twitter using sentiment analysis and...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 3, 2023
    Cite
    Almani, Nada (2023). Arabic news credibility on Twitter using sentiment analysis and ensemble learning [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8000716
    Dataset provided by
    Samdani, Duha
    Almani, Nada
    Taileb, Mounira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Arabic news credibility on Twitter using sentiment analysis and ensemble learning.

    WHAT IS IT?

    An Arabic news credibility model for Twitter using sentiment analysis and ensemble learning.

    Here we include the collected dataset and the source code of the proposed model, written in Python using the Keras library with a TensorFlow backend.

    Required Packages

    Keras (https://keras.io/).

    Scikit-learn (http://scikit-learn.org/)

    Imblearn (imbalanced-learn, version 0.10.1)

    To Run the model

    One data file is required to run the model:

    The collected dataset file. Set the path of this data file in the code.

    The dataset

    There is a dataset file with all features; you can choose the features that you need and apply them to the model.

    There is a description file that describes each feature in the news credibility dataset.

    The file Tweet_ID contains the list of tweet IDs in the dataset.

    The annotated replies, labeled by credibility, are provided.

    CONTACTS

    If you want to report bugs or have general queries email to

  17. TM-Senti

    • figshare.com
    bz2
    Updated Aug 25, 2021
    Cite
    Wenjie Yin; Rabab Alkhalifa; Arkaitz Zubiaga (2021). TM-Senti [Dataset]. http://doi.org/10.6084/m9.figshare.16438281.v1
    Explore at:
    bz2 (available download format)
    Dataset updated
    Aug 25, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Wenjie Yin; Rabab Alkhalifa; Arkaitz Zubiaga
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a large-scale, multilingual and longitudinal Twitter sentiment dataset sampled through distant supervision from the Twitter Stream Grab archive (https://archive.org/details/twitterstream). It covers the time period between January 2013 and June 2020 for 7 languages:

    - Arabic (ar)
    - German (de)
    - English (en)
    - Spanish (es)
    - French (fr)
    - Italian (it)
    - Chinese (zh)

    With the files in this repository, we provide tweet IDs that can be used to rehydrate the datasets by using the files available from the Twitter Stream Grab.

    Files are formatted as TSV files, with the following columns:

    date \t tweetid \t sentiment \t evidence

    where:

    - date is the day in which the tweet was posted.
    - tweetid is the ID of the tweet
    - sentiment is either pos or neg
    - evidence is the set of emojis or emoticons used to determine if the tweet was positive or negative.

    More details about the dataset can be found in the following paper (please cite the paper if you use the dataset): TBA
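
    As a rough illustration of the four-column TSV layout described above, the following Python sketch parses such rows using only the standard library. The names `SentiRow`, `read_senti_rows`, and `read_senti_file` are made up for this example. It also assumes the files are UTF-8 text, and it is not known whether they carry a header row (a header would simply be skipped by the `pos`/`neg` check).

```python
import bz2
import csv
from typing import Iterable, Iterator, NamedTuple


class SentiRow(NamedTuple):
    date: str
    tweetid: str
    sentiment: str  # "pos" or "neg"
    evidence: str   # emojis or emoticons that determined the label


def read_senti_rows(lines: Iterable[str]) -> Iterator[SentiRow]:
    """Yield one SentiRow per well-formed, tab-separated line."""
    # QUOTE_NONE: emoticons may contain quote characters, so treat them literally.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        row = SentiRow(*(field.strip() for field in fields))
        if row.sentiment in ("pos", "neg"):
            yield row


def read_senti_file(path: str) -> Iterator[SentiRow]:
    """Parse a bz2-compressed TSV file; `path` is supplied by the caller."""
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from read_senti_rows(handle)
```

    The `tweetid` field is kept as a string so that large IDs pass through unchanged to whatever rehydration tool is used.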

  18. TRACES Sentiment Analysis Twitter Dataset

    • zenodo.org
    Updated Oct 10, 2023
    Cite
    Irina Temnikova; Silvia Gargova (2023). TRACES Sentiment Analysis Twitter Dataset [Dataset]. http://doi.org/10.5281/zenodo.7357386
    Explore at:
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Irina Temnikova; Silvia Gargova
    Description

    This dataset has been created within Project TRACES (more information: https://traces.gate-ai.eu/). The dataset contains 1810 unique tweet IDs, written in Bulgarian, with annotations (positive, negative, neutral). The tweets are on the topics of lies, manipulation, and Covid-19 and are a subset of the following datasets:

    https://zenodo.org/record/7296865

    https://zenodo.org/record/7296736

    https://zenodo.org/record/7296877

    The tweets were collected via the Twitter API under academic access between 1 January 2020 and 28 June 2022 and thus cannot be used for commercial purposes.

  19. twitter-financial-news-sentiment

    • huggingface.co
    • opendatalab.com
    Updated Dec 4, 2022
    + more versions
    Cite
    not a (2022). twitter-financial-news-sentiment [Dataset]. https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 4, 2022
    Authors
    not a
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description

    The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.

    The dataset holds 11,932 documents annotated with 3 labels:

    sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }

    The data was collected using the Twitter API. The current dataset supports the multi-class classification
    See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
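
    The label mapping quoted above can be applied with a small helper such as the sketch below. The `sentiments` dictionary is copied from the description. The helper names are hypothetical, and the assumption that an integer class id `n` corresponds to the key `LABEL_n` is ours rather than something the description states.

```python
from collections import Counter
from typing import Iterable, Union

# Mapping copied from the dataset description.
SENTIMENTS = {"LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral"}


def label_name(label: Union[int, str]) -> str:
    """Translate an integer id or a "LABEL_n" key into its sentiment name."""
    # Assumption: integer id n corresponds to the key "LABEL_n".
    key = label if isinstance(label, str) else f"LABEL_{int(label)}"
    try:
        return SENTIMENTS[key]
    except KeyError:
        raise ValueError(f"unknown sentiment label: {label!r}") from None


def label_distribution(labels: Iterable[Union[int, str]]) -> Counter:
    """Count how often each sentiment name occurs in a sequence of labels."""
    return Counter(label_name(label) for label in labels)
```

    A distribution count like this is a quick check of class balance before training a multi-class classifier.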

  20. Twitter tweets data

    • kaggle.com
    Updated Mar 31, 2019
    Cite
    Nitin G (2019). Twitter tweets data [Dataset]. https://www.kaggle.com/nitin194/twitter-sentiment-analysis/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 31, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nitin G
    Description

    Dataset

    This dataset was created by Nitin G

    Contents

Cite
Rabindra Lamsal (2024). Twitter Sentiment Analysis Data [Dataset]. https://ieee-dataport.org/documents/twitter-sentiment-analysis-data

Twitter Sentiment Analysis Data

Explore at:
208 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Aug 6, 2024
Authors
Rabindra Lamsal
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

because of COVID-19
