49 datasets found
  1. Customer Support Twitter Data

    • kaggle.com
    zip
    Updated Aug 29, 2025
    Cite
    Muhammad Asif (2025). Customer Support Twitter Data [Dataset]. https://www.kaggle.com/datasets/muhammadasif786/customer-support-twitter-data
    Explore at:
    Available download formats: zip (176765850 bytes)
    Dataset updated
    Aug 29, 2025
    Authors
    Muhammad Asif
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Muhammad Asif

    Released under Apache 2.0

    Contents

  2. Customer Support on Twitter

    • kaggle.com
    zip
    Updated Oct 17, 2024
    Cite
    Amin Aslami (2024). Customer Support on Twitter [Dataset]. https://www.kaggle.com/datasets/aminaslam/customer-support-on-twitter
    Explore at:
    Available download formats: zip (78948 bytes)
    Dataset updated
    Oct 17, 2024
    Authors
    Amin Aslami
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Amin Aslami

    Released under Apache 2.0

    Contents

  3. Customer Support on Twitter

    • kaggle.com
    zip
    Updated Dec 3, 2017
    Cite
    Thought Vector (2017). Customer Support on Twitter [Dataset]. https://www.kaggle.com/dsv/8841
    Explore at:
    Available download formats: zip (176772673 bytes)
    Dataset updated
    Dec 3, 2017
    Dataset authored and provided by
    Thought Vector
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    The Customer Support on Twitter dataset is a large, modern corpus of tweets and replies to aid innovation in natural language understanding and conversational models, and for study of modern customer support practices and impact.

    [Figure: Example Analysis - Inbound Volume for the Top 20 Brands (https://i.imgur.com/nTv3Iuu.png)]

    Context

    Natural language remains the densest encoding of human experience we have, and innovation in NLP has accelerated to power understanding of that data, but the datasets driving this innovation don't match the real language in use today. The Customer Support on Twitter dataset offers a large corpus of modern English (mostly) conversations between consumers and customer support agents on Twitter, and has three important advantages over other conversational text datasets:

    • Focused - Consumers contact customer support to have a specific problem solved, and the manifold of problems to be discussed is relatively small, especially compared to unconstrained conversational datasets like the reddit Corpus.
    • Natural - Consumers in this dataset come from a much broader segment than those in the Ubuntu Dialogue Corpus and have much more natural and recent use of typed text than the Cornell Movie Dialogs Corpus.
    • Succinct - Twitter's brevity causes more natural responses from support agents (rather than scripted), and to-the-point descriptions of problems and solutions. It is also convenient because it allows a relatively small maximum message length for recurrent nets.

    Inspiration

    The size and breadth of this dataset inspires many interesting questions:

    • Can we predict company responses? Given the bounded set of subjects handled by each company, the answer seems like yes!
    • Do requests get stale? How quickly do the best companies respond, compared to the worst?
    • Can we learn high quality dense embeddings or representations of similarity for topical clustering?
    • How does tone affect the customer support conversation? Does saying sorry help?
    • Can we help companies identify new problems, or ones most affecting their customers?

    Acknowledgements

    Dataset built with PointScrape.

    Content

    The dataset is a CSV, where each row is a tweet. The different columns are described below. Every conversation included has at least one request from a consumer and at least one response from a company. Which user IDs are company user IDs can be calculated using the inbound field.

    tweet_id

    A unique, anonymized ID for the Tweet. Referenced by response_tweet_id and in_response_to_tweet_id.

    author_id

    A unique, anonymized user ID. @s in the dataset have been replaced with their associated anonymized user ID.

    inbound

    Whether the tweet is "inbound" to a company doing customer support on Twitter. This feature is useful when re-organizing data for training conversational models.

    created_at

    Date and time when the tweet was sent.

    text

    Tweet content. Sensitive information like phone numbers and email addresses is replaced with mask values like _email_.

    response_tweet_id

    IDs of tweets that are responses to this tweet, comma-separated.

    in_response_to_tweet_id

    ID of the tweet this tweet is in response to, if any.
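
    The linking fields above can be used to pair each company reply with the customer tweet it answers. The following is a minimal sketch, not the dataset author's method: the sample rows are made up, and it assumes that inbound is serialized as the strings "True" and "False" and that empty link fields are blank.

```python
import csv
import io

# Made-up sample rows using the column names described above.
SAMPLE = """tweet_id,author_id,inbound,created_at,text,response_tweet_id,in_response_to_tweet_id
1,115712,True,Tue Oct 31 22:10:47 +0000 2017,@sprintcare my phone has no signal,2,
2,sprintcare,False,Tue Oct 31 22:11:45 +0000 2017,@115712 Please send us a DM.,3,1
3,115712,True,Tue Oct 31 22:15:00 +0000 2017,@sprintcare done,,2
"""


def request_response_pairs(rows):
    """Return (customer_text, company_text) pairs linked by in_response_to_tweet_id."""
    by_id = {row["tweet_id"]: row for row in rows}
    pairs = []
    for row in rows:
        # Company replies are the rows where inbound is False (assumed string encoding).
        if row["inbound"] != "False":
            continue
        parent = by_id.get(row["in_response_to_tweet_id"])
        # Keep only replies whose parent is an inbound (customer) tweet.
        if parent is not None and parent["inbound"] == "True":
            pairs.append((parent["text"], row["text"]))
    return pairs


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
pairs = request_response_pairs(rows)

# Company user IDs are the authors of non-inbound tweets.
company_ids = {row["author_id"] for row in rows if row["inbound"] == "False"}
```

    Indexing rows by tweet_id first keeps the pairing a single pass over the data, which matters for a file of this size.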

    Contributing

    Know of other brands the dataset should include? Found something that needs to be fixed? Start a discussion, or email me directly at $FIRSTNAME@$LASTNAME.com!

    Acknowledgements

    A huge thank you to my friends who helped bootstrap the list of companies that do customer support on Twitter! There are many rocks that would have been left un-turned were it not for your suggestions!

    Relevant Resources

    Licensing

    For commercial applications and use of full dataset, please contact stuart@thoughtvector.io.

  4. Twitter Tweets Sentiment Dataset

    • kaggle.com
    zip
    Updated Apr 8, 2022
    Cite
    M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
    Explore at:
    Available download formats: zip (1289519 bytes)
    Dataset updated
    Apr 8, 2022
    Authors
    M Yasser H
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description:

    Twitter is an online social media platform where people share their thoughts as tweets. Some people misuse it to tweet hateful content. Twitter is trying to tackle this problem, and we can help by creating a strong NLP-based classifier that distinguishes negative tweets so they can be blocked. Can you build a strong classifier model to predict the same?

    Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

    Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.

    You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)
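
    The two parsing notes above (dropping the outer quotes and treating the selected phrase as an exact character span) can be sketched as follows. This is an illustrative example only: the helper names and the sample tweet are hypothetical, and a standard CSV parser may already remove the quotes for you.

```python
def strip_outer_quotes(value):
    """Remove one pair of matching leading/trailing double quotes, if present."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def find_span(text, selected_text):
    """Return (start, end) character offsets of selected_text in text, or None."""
    start = text.find(selected_text)
    if start == -1:
        return None
    return start, start + len(selected_text)


# Hypothetical raw field value that still carries its surrounding quotes.
text = strip_outer_quotes('"so happy, the flight was on time"')

# The span includes every character of the phrase, commas and spaces included.
span = find_span(text, "so happy,")
```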

    Columns:

    1. textID - unique ID for each piece of text
    2. text - the text of the tweet
    3. sentiment - the general sentiment of the tweet

    Acknowledgement:

    The dataset was downloaded from Kaggle Competitions:
    https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

    Objective:

    • Understand the Dataset & cleanup (if required).
    • Build classification models to predict the twitter sentiments.
    • Compare the evaluation metrics of various classification algorithms.
  5. Support data for Chatbots

    • kaggle.com
    zip
    Updated Feb 26, 2025
    Cite
    Mohammad Faizan (2025). Support data for Chatbots [Dataset]. https://www.kaggle.com/datasets/mohammadfaizannaeem/3m-tweet-data-of-world-biggest-brands-on-twitter/data
    Explore at:
    Available download formats: zip (176765850 bytes)
    Dataset updated
    Feb 26, 2025
    Authors
    Mohammad Faizan
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    File Description

    This dataset contains Twitter support conversations collected from various company accounts. It includes customer inquiries and corresponding support responses. The data is useful for training AI chatbots, analyzing customer service trends, and developing sentiment analysis models.

    Column Description

    This dataset contains customer support interactions on Twitter. It includes the following columns:

    • tweet_id: A unique identifier for each tweet.
    • author_id: The unique ID of the user who posted the tweet.
    • inbound: A boolean value indicating whether the tweet is from a customer (True) or from the support team (False).
    • created_at: The timestamp of when the tweet was posted (in UTC format).
    • text: The content of the tweet.
    • response_tweet_id: The unique ID of the response tweet, if applicable.
    • in_response_to_tweet_id: The ID of the original tweet to which this tweet is responding.

    How This Data Can Be Used

    • Training a chatbot: Helps in generating automated support responses.
    • Sentiment analysis: Can analyze whether tweets are complaints, queries, or feedback.
    • Conversation tracking: By linking response tweets with original messages.

    Original author: MANORAMA
    Source: https://www.kaggle.com/datasets/manovirat/aspect/data

    Note: This dataset is shared for educational and research purposes only.

  6. Twitter Customer Reviews of Popular Smart Phone

    • kaggle.com
    zip
    Updated Jun 8, 2024
    Cite
    Shibbir Ahmed Arif (2024). Twitter Customer Reviews of Popular Smart Phone [Dataset]. https://www.kaggle.com/datasets/shibbir282/twitter-customer-reviews-of-popular-smart-phone
    Explore at:
    Available download formats: zip (1236373 bytes)
    Dataset updated
    Jun 8, 2024
    Authors
    Shibbir Ahmed Arif
    Description

    Context

    This dataset is part of our research work titled "Opinion Mining of Customer Reviews Using Supervised Learning Algorithms". If you use this dataset, please cite our work. You can find the article at https://ieeexplore.ieee.org/document/9733435

    Content

    Nowadays, a lot of people express their opinions on various topics using social networking sites. Twitter has become a popular social networking site where people can express their opinions concisely, which makes it a great source for opinion mining. In this research, the goal was to train and build a model that can automatically and accurately categorize the opinion of customer tweet reviews about popular cell phone brands. We used the Python TextBlob library to get the polarity values of all the tweet reviews in the dataset. We also used Support Vector Machine (SVM), Naïve Bayes, Logistic Regression, Decision Tree and Random Forest algorithms, each with Bag of Words and TF-IDF vectorizers separately, to train and build the model. We investigated the opinions using five classes: Strongly Positive, Positive, Neutral, Negative and Strongly Negative.
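
    To illustrate how a polarity score could be turned into the five classes named above, here is a minimal sketch. The description does not say which cut-offs the authors used, so the thresholds (0.5 and 0.05) and the function name are assumptions for illustration only; the only grounded fact is that TextBlob polarity falls in the range [-1.0, 1.0].

```python
def polarity_to_class(polarity, strong=0.5, weak=0.05):
    """Map a polarity score in [-1.0, 1.0] to one of five opinion classes.

    The strong/weak thresholds are hypothetical, not taken from the paper.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1.0 and 1.0")
    if polarity >= strong:
        return "Strongly Positive"
    if polarity > weak:
        return "Positive"
    if polarity <= -strong:
        return "Strongly Negative"
    if polarity < -weak:
        return "Negative"
    return "Neutral"


# Example scores, one per class.
labels = [polarity_to_class(p) for p in (0.8, 0.2, 0.0, -0.2, -0.8)]
```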

    When referencing this dataset please cite the below paper

    BibTeX:

    @inproceedings{arif2021opinion,
      title={Opinion Mining of Customer Reviews Using Supervised Learning Algorithms},
      author={Arif, Shibbir Ahmed and Hossain, Taslima Binte},
      booktitle={2021 5th International Conference on Electrical Information and Communication Technology (EICT)},
      pages={1--6},
      year={2021},
      organization={IEEE}
    }

  7. Saudi Customer Care Tweets

    • kaggle.com
    zip
    Updated Mar 13, 2024
    Cite
    Abdullah Alsharif (2024). Saudi Customer Care Tweets [Dataset]. https://www.kaggle.com/datasets/alshreefabdullh/saudi-customer-care-tweets
    Explore at:
    Available download formats: zip (10030314 bytes)
    Dataset updated
    Mar 13, 2024
    Authors
    Abdullah Alsharif
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    Saudi Arabia
    Description

    This data was collected from several customer care accounts and consists of customer inquiries.

    • "fullText": the full-text content of the tweet.
    • "lang": the language in which the tweet is written.
    • "viewsCount": the count of views or impressions the tweet has received.
    • "bookmarkCount": the count of times the tweet has been bookmarked by users.
    • "favoriteCount": the count of times the tweet has been favorited by users.
    • "replyCount": the count of replies the tweet has received.
    • "retweetCount": the count of times the tweet has been retweeted by users.
    • "quoteCount": the count of times the tweet has been quoted by users.

  8. Customer Support Tweets (945M rows)

    • kaggle.com
    zip
    Updated Oct 31, 2025
    Cite
    Galal Qassas (2025). Customer Support Tweets (945M rows) [Dataset]. https://www.kaggle.com/datasets/galalqassas/customer-support-tweets-945m-rows
    Explore at:
    Available download formats: zip (74154613 bytes)
    Dataset updated
    Oct 31, 2025
    Authors
    Galal Qassas
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Galal Qassas

    Released under MIT

    Contents

  9. customer care tweets KSA

    • kaggle.com
    zip
    Updated May 20, 2022
    Cite
    Mansour (2022). customer care tweets KSA [Dataset]. https://www.kaggle.com/datasets/mansourhussain/customer-care-tweets-ksa/discussion
    Explore at:
    Available download formats: zip (642212 bytes)
    Dataset updated
    May 20, 2022
    Authors
    Mansour
    Area covered
    Saudi Arabia
    Description

    - This data contains 10000 tweets for a telecom company's customer care account on Twitter.

    - This data is intended for use in sentiment analysis in Arabic.

  10. Apple Tweet Dataset

    • kaggle.com
    zip
    Updated Mar 26, 2022
    Cite
    sup_tenshi (2022). Apple Tweet Dataset [Dataset]. https://www.kaggle.com/datasets/suptenshi/apple-tweet-dataset
    Explore at:
    Available download formats: zip (375032 bytes)
    Dataset updated
    Mar 26, 2022
    Authors
    sup_tenshi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains tweets about Apple products on Twitter and can be used for sentiment analysis. It has three columns:

    1. tweet_text
    2. emotion_in_tweet_is_directed_at
    3. is_there_an_emotion_directed_at_a_brand_or_product

  11. COVID-19 Twitter Dataset

    • kaggle.com
    zip
    Updated Jul 4, 2020
    Cite
    Jingli SHI (2020). COVID-19 Twitter Dataset [Dataset]. https://www.kaggle.com/datasets/shijingli/covid19-twitter-dataset
    Explore at:
    Available download formats: zip (29449949 bytes)
    Dataset updated
    Jul 4, 2020
    Authors
    Jingli SHI
    Description

    Context

    There are a total of 20 CSV files containing tweets related to COVID-19 from 20 March 2020 to 08 April 2020.

    Content

    For each file, the following columns are included. Columns: coordinates, created_at, hashtags, media, urls, favorite count, id, in_reply_to_screen_name, in_reply_to_status_id, in_reply_to_user_id.

    Rights

    The relevant sections of Twitter's Terms of Service [1] and Developer Agreement [2] apply. According to Twitter's Developer Policy §6 [3]: "If you provide Content to third parties, including downloadable datasets of Content or an API that returns Content, you will only distribute or allow download of Tweet IDs and/or User IDs" and "any Content provided to third parties via non-automated file download remains subject to this Policy".

    [1] https://twitter.com/tos?lang=en
    [2] https://dev.twitter.com/overview/terms/agreement
    [3] https://dev.twitter.com/overview/terms/policy#6.Update_Be_a_Good_Partner_to_Twitter

  12. Bank customer tweets (10000)

    • kaggle.com
    zip
    Updated Sep 25, 2022
    Cite
    Bayode Ogunleye (2022). Bank customer tweets (10000) [Dataset]. https://www.kaggle.com/datasets/batoog/bank-customer-tweets-10000
    Explore at:
    Available download formats: zip (563396 bytes)
    Dataset updated
    Sep 25, 2022
    Authors
    Bayode Ogunleye
    Description

    If you use this dataset, please cite it using the reference below.

    Ogunleye, B. O. (2021). Statistical learning approaches to sentiment analysis in the Nigerian banking context (Doctoral dissertation, Sheffield Hallam University).

  13. Twitter dataset of Facebook/Meta

    • kaggle.com
    zip
    Updated Apr 10, 2022
    Cite
    Haber (2022). Twitter dataset of Facebook/Meta [Dataset]. https://www.kaggle.com/datasets/haber0322/twitter-dataset-of-facebook
    Explore at:
    Available download formats: zip (1113136 bytes)
    Dataset updated
    Apr 10, 2022
    Authors
    Haber
    License

    Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
    License information was derived automatically

    Description

    Public dataset that everyone can use. Creating a dataframe from the tweets list above.

    E-mail supervision

    In order to keep a regular discussion going, it is useful to use e-mail. There are many distance discussions that take place by e-mail or fax, backed up with some visits for full-blown supervisions. If the supervisor is in another country, then e-mail contact is essential, as the face-to-face supervisory contacts will be condensed into the periods when you can both be in the same country. Make e-mail contacts lucid, short and precise, with some friendly tone to establish a personal touch. Try not to get involved in excessively chatty discussions but concentrate on asking questions, seeking information and reporting on findings for comment.

    E-mail is quite an insistent medium. If you make contact too frequently, the supervisor will feel harassed. If you make contact too infrequently, the supervisor will feel guilty (and so will you), wondering what you are up to. Regular brief contact with some very full discussions on work in progress at regular intervals will maintain a sense of a working relationship over time and space.

  14. Brand Sentiment Analysis Dataset (Twitter)

    • kaggle.com
    zip
    Updated Jan 7, 2024
    Cite
    Tushar Paul (2024). Brand Sentiment Analysis Dataset (Twitter) [Dataset]. https://www.kaggle.com/datasets/tusharpaul2001/brand-sentiment-analysis-dataset
    Explore at:
    Available download formats: zip (375745 bytes)
    Dataset updated
    Jan 7, 2024
    Authors
    Tushar Paul
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset description

    Users assessed tweets related to various brands and products, providing evaluations on whether the sentiment conveyed was positive, negative, or neutral. Additionally, if the tweet conveyed any sentiment, contributors identified the specific brand or product targeted by that emotion.

    Train Dataset: 8589 rows x 3 columns

    Test Dataset: 504 rows x 1 column

  15. Twitter Airline Sentiment Dataset

    • kaggle.com
    zip
    Updated Nov 14, 2025
    Cite
    Chandana Ramakrishna (2025). Twitter Airline Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/chandana890/twitter-airline-sentiment-dataset
    Explore at:
    Available download formats: zip (1134990 bytes)
    Dataset updated
    Nov 14, 2025
    Authors
    Chandana Ramakrishna
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    This dataset contains tweets related to major US airlines and is widely used for NLP and sentiment analysis tasks. Each record includes the tweet text, timestamp, airline name, and sentiment label (positive, negative, neutral). This uploaded version is prepared to support advanced text processing, machine learning, and anomaly detection experiments.

    What's Included

    • Tweets.csv – Full collection of airline-related tweets
    • Text content suitable for NLP tasks
    • Timestamp information (useful for time-based analysis)
    • Sentiment labels for classification and evaluation
    • Cleaned text field for direct use in ML pipelines

    Purpose of This Dataset

    This dataset is used in a machine learning workflow focused on:

    • sentiment analysis
    • embedding generation (transformers)
    • dimensionality reduction (PCA, UMAP)
    • clustering and visualization
    • unsupervised anomaly detection using Isolation Forest

    It is especially suited for exploring changes in public sentiment, event detection, and contextual analysis in social media data.

    Key Use Cases

    • Building and testing NLP models
    • Semantic similarity and embedding-based analysis
    • Sentiment classification
    • Detecting anomalous posts or time periods
    • Visualizing tweet clusters using UMAP
    • Studying customer feedback patterns in the airline industry

    Source

    Originally derived from the Twitter US Airline Sentiment dataset on Kaggle.
    This uploaded version is intended for educational, analytical, and research purposes.

    Notes

    If you're using this dataset in a notebook, update the file path to match your environment:

    ```python
    import pandas as pd

    # Path as it appears when the dataset is attached to a Kaggle notebook.
    df = pd.read_csv("/kaggle/input/twitter-airline-sentiment-dataset/Tweets.csv")
    ```

  16. Drug-related Tweets Dataset

    • kaggle.com
    zip
    Updated Sep 17, 2025
    Cite
    Techno Care (2025). Drug-related Tweets Dataset [Dataset]. https://www.kaggle.com/datasets/technocare/drug-related-tweets-dataset
    Explore at:
    Available download formats: zip (9522011 bytes)
    Dataset updated
    Sep 17, 2025
    Authors
    Techno Care
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset contains drug-related text entries structured to resemble tweets. It was generated from the drugsComTest_raw.csv dataset, which originally included patient reviews of medications.

    Source: Extracted from patient-submitted reviews on Drugs.com.

    Format: CSV file with two columns:

    • drugName – the name of the drug mentioned.
    • tweet – the review text reformatted to simulate a tweet-like message.

    Purpose:

    • To support Natural Language Processing (NLP) tasks such as sentiment analysis, drug-effect classification, and social media mining.
    • To act as a proxy dataset for training or testing models on drug-related discussions, where actual Twitter data collection is restricted or unavailable.

    Limitations:

    • Not real Twitter data, but synthetic tweets generated from formal drug reviews.
    • May differ in tone and structure compared to actual tweets.

  17. Famous Keyword Twitter Replies

    • kaggle.com
    zip
    Updated Jun 7, 2023
    Cite
    _w1998 (2023). Famous Keyword Twitter Replies [Dataset]. https://www.kaggle.com/datasets/jackksoncsie/famous-keyword-twitter-replies-dataset
    Explore at:
    Available download formats: zip (7691153 bytes)
    Dataset updated
    Jun 7, 2023
    Authors
    _w1998
    License

    http://www.gnu.org/licenses/fdl-1.3.html

    Description

    The "Famous Keyword Twitter Replies Dataset" is a comprehensive collection of Twitter data that focuses on popular keywords and their associated replies. This dataset contains five essential columns that provide valuable insights into the Twitter conversation dynamics:

    1. Keyword: This column represents the specific keyword or topic of interest that generated the original tweet. It helps identify the context or subject matter around which the conversation revolves.

    2. Main_tweet: The main_tweet column contains the original tweet related to the keyword. It serves as the starting point or focal point of the conversation and often provides essential information or opinions on the given topic.

    3. Main_likes: This column provides the number of likes received by the main_tweet. Likes serve as a measure of engagement and indicate the level of popularity or resonance of the original tweet within the Twitter community.

    4. Reply: The reply column consists of the replies or responses to the main_tweet. These replies may include comments, opinions, additional information, or discussions related to the keyword or the original tweet itself. The replies help capture the diverse perspectives and conversations that emerge in response to the main_tweet.

    5. Reply_likes: This column records the number of likes received by each reply. Similar to the main_likes column, the reply_likes column measures the level of engagement and popularity of individual replies. It enables the identification of particularly noteworthy or well-received replies within the dataset.

    By analyzing this "Famous Keyword Twitter Replies Dataset," researchers, analysts, and data scientists can gain valuable insights into how popular keywords spark discussions on Twitter and how these discussions evolve through replies.

    The dataset's information on likes allows for the evaluation of tweet and reply popularity, helping to identify influential or impactful content.

    This dataset serves as a valuable resource for various applications, including sentiment analysis, trend identification, opinion mining, and understanding social media dynamics.
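
    As an illustration of the like-based analysis described above, the sketch below picks the most-liked reply for each main tweet. The sample rows are made up, and the exact header spellings (Keyword, Main_tweet, Main_likes, Reply, Reply_likes) are an assumption based on the column list; check them against the actual file.

```python
import csv
import io

# Made-up sample; header names are assumed from the column list above.
SAMPLE = """Keyword,Main_tweet,Main_likes,Reply,Reply_likes
python,Python 3 is great,120,Agreed!,15
python,Python 3 is great,120,Still prefer Go,40
nba,What a game last night,300,Unreal finish,7
"""


def top_reply_per_tweet(rows):
    """Map (keyword, main_tweet) to its most-liked (reply, reply_likes)."""
    best = {}
    for row in rows:
        key = (row["Keyword"], row["Main_tweet"])
        likes = int(row["Reply_likes"])
        # Keep the reply with the highest like count seen so far.
        if key not in best or likes > best[key][1]:
            best[key] = (row["Reply"], likes)
    return best


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
top = top_reply_per_tweet(rows)
```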

    [Figure: number of tweets for each tweet/reply pair]

    The dataset contains 17255 tweet/reply pairs in total.

  18. Celebrity Tweets Dataset (Real vs AI-Generated)

    • kaggle.com
    zip
    Updated Apr 18, 2025
    Cite
    Gor Abaghyan (2025). Celebrity Tweets Dataset (Real vs AI-Generated) [Dataset]. https://www.kaggle.com/datasets/abaghyangor/celebrity-tweets
    Explore at:
    Available download formats: zip (3147 bytes)
    Dataset updated
    Apr 18, 2025
    Authors
    Gor Abaghyan
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains 120 tweets from 8 popular celebrities. Each tweet is labeled as either real (actually posted by the celebrity) or fake (AI-generated to mimic their tone). It was originally built for a hackathon project called TweetLike, but was later cleaned and restructured to support machine learning projects.

    The goal is to help explore how writing style, tone, and voice can be modeled — and how convincingly AI can imitate real people online. The dataset is ideal for NLP experiments like author prediction, fake vs real classification, and stylistic analysis.

    To keep the dataset balanced and ML-friendly, we ensured that each celebrity has exactly 15 tweets. In some cases, controlled oversampling was used to meet this count — this is intentional to support fair training of models across classes.

  19. Sentiment with 1.6 million tweets with locations

    • kaggle.com
    zip
    Updated Mar 12, 2023
    Cite
    vivek chary (2023). Sentiment with 1.6 million tweets with locations [Dataset]. https://www.kaggle.com/datasets/vivekchary/sentiment-with-16-million-tweets-with-locations
    Explore at:
    Available download formats: zip (86959692 bytes)
    Dataset updated
    Mar 12, 2023
    Authors
    vivek chary
    Description

    The "Sentiment with 1.6 million tweets with locations" dataset is a collection of tweets with their respective geographical location information and sentiment labels. The dataset includes 1.6 million tweets from various locations around the world, spanning a period of several years. The sentiment labels for each tweet are binary, indicating whether the sentiment expressed in the tweet is positive or negative.

    This dataset can be used for sentiment analysis and natural language processing tasks, such as training machine learning models to classify the sentiment of text data. Researchers and developers can use this dataset to analyze trends in sentiment across different locations and time periods, as well as to develop new algorithms and models for sentiment analysis.

    Please note that this dataset is intended for research purposes only and should not be used for any commercial or legal applications. The dataset may also contain offensive or inappropriate language, and users should exercise caution when working with this data.

    Context

    Some additional context for the "Sentiment with 1.6 million tweets with locations" dataset:

    • The dataset was compiled and made publicly available by Vivek Chary, a data scientist and machine learning engineer.
    • The tweets were collected using the Twitter API, and the dataset was last updated in 2017.
    • The dataset includes tweets in various languages, although the majority are in English.
    • Sentiment analysis is a common application of natural language processing, and has a wide range of potential use cases, such as in market research, social media monitoring, and customer service.
    • Sentiment analysis can be challenging due to the complexity and ambiguity of language, as well as the variability of individual expression and context.
    • Large datasets like this one are important for developing accurate and robust sentiment analysis models, as they provide a diverse and representative sample of real-world text data.

    Content

    The dataset contains the following 7 fields:

    1. Sentiment Target: The polarity of the tweet, indicated by a numeric value of 0 (negative), 2 (neutral), or 4 (positive).

    2. Tweet ID: The unique identifier of the tweet.

    3. Date: The date and time the tweet was posted, in Coordinated Universal Time (UTC).

    4. Query Flag: The keyword or phrase used to filter the tweets. If no query was used, the value is NO_QUERY.

    5. User: The username of the Twitter account that posted the tweet.

    6. Text: The actual text content of the tweet.

    7. Location: The location of the tweet.
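
The field list above can be read with Python's standard `csv` module. This is a minimal sketch under assumptions the description does not confirm: the file has no header row, the columns appear in the listed order, and the short field names in `FIELDS` are hypothetical labels chosen here. The two sample rows are invented for illustration.

```python
import csv
import io

# Hypothetical short names for the 7 fields, in the order listed above.
FIELDS = ["target", "tweet_id", "date", "query", "user", "text", "location"]

# Polarity codes as given in the field description.
POLARITY = {"0": "negative", "2": "neutral", "4": "positive"}


def read_tweets(stream):
    """Yield one dict per tweet with a readable `sentiment` label added."""
    reader = csv.DictReader(stream, fieldnames=FIELDS)
    for row in reader:
        row["sentiment"] = POLARITY.get(row["target"], "unknown")
        yield row


# Invented sample rows; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    '0,1001,2009-04-06 22:19:45,NO_QUERY,alice,"so tired, again",Paris\n'
    '4,1002,2009-04-06 22:20:01,NO_QUERY,bob,"great day",Austin\n'
)
tweets = list(read_tweets(sample))
```

Mapping unknown codes to `"unknown"` instead of raising keeps the loader usable if the file contains values outside 0, 2, and 4.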

  20. Twitter Stocks Dataset

    • kaggle.com
    zip
    Updated Nov 7, 2022
    Cite
    MaharshiPandya (2022). Twitter Stocks Dataset [Dataset]. https://www.kaggle.com/datasets/maharshipandya/twitter-stocks-dataset
    Explore at:
    Available download formats: zip (44765 bytes)
    Dataset updated
    Nov 7, 2022
    Authors
    MaharshiPandya
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Content

    This is a dataset of Twitter stock prices covering 9 years, from November 2013 to October 2022. The data is in tabular CSV format and can be loaded quickly.

    Usage

    The dataset can be used for:

    • Time Series Analysis of the stock prices
    • Forecasting whether the stock will go into an uptrend or downtrend
    • Finding underlying patterns or trends
    • Any other application that you can think of. Feel free to discuss!

    Column Description

    There are 7 columns in this dataset.

    Note: The currency is in USD ($)

    • Date: The date for which the stock data is considered.
    • Open: The stock's opening price on that day.
    • High: The stock's highest price on that day.
    • Low: The stock's lowest price on that day.
    • Close: The stock's closing price on that day. The close price is adjusted for splits.
    • Adj Close: The closing price adjusted for splits and for dividend and/or capital gain distributions.
    • Volume: The number of shares traded on that day.
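
As a rough illustration of the uptrend/downtrend use case, the sketch below labels each day by comparing its close with the average close of the preceding days. It assumes the CSV has a header row whose names match the columns listed above; the helper names, the 3-day window, and the sample rows are all invented for illustration and are not taken from the dataset.

```python
import csv
import io


def close_prices(stream):
    """Read (date, close) pairs from a CSV with a header row."""
    return [(row["Date"], float(row["Close"])) for row in csv.DictReader(stream)]


def label_trend(prices, window=3):
    """Label each day 'up' or 'down' against the trailing moving average."""
    labels = []
    for i in range(window, len(prices)):
        date, close = prices[i]
        # Average close over the `window` days before day i.
        average = sum(p for _, p in prices[i - window:i]) / window
        labels.append((date, "up" if close > average else "down"))
    return labels


# Invented sample rows; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2022-01-03,10.0,10.5,9.8,10.0,10.0,1000\n"
    "2022-01-04,10.0,11.2,9.9,11.0,11.0,1200\n"
    "2022-01-05,11.0,12.3,10.8,12.0,12.0,900\n"
    "2022-01-06,12.0,13.4,11.9,13.0,13.0,1100\n"
    "2022-01-07,13.0,13.1,9.0,9.5,9.5,2000\n"
)
trend = label_trend(close_prices(sample))
```

A trailing average uses only past prices, so the label for a given day never depends on data from later days. This is a descriptive label, not a forecast.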

    Acknowledgement

    Image credits: IndiaTimes

Cite
Muhammad Asif (2025). Customer Support Twitter Data [Dataset]. https://www.kaggle.com/datasets/muhammadasif786/customer-support-twitter-data

Customer Support Twitter Data

Explore at:
7 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (176765850 bytes)
Dataset updated
Aug 29, 2025
Authors
Muhammad Asif
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

Dataset

This dataset was created by Muhammad Asif

Released under Apache 2.0

Contents
