100+ datasets found

Twitter Tweets Sentiment Dataset
kaggle.com
zip
Updated Apr 8, 2022
Cite
M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
Explore at:
Available download formats: zip (1289519 bytes)
Dataset updated
Apr 8, 2022
Authors
M Yasser H
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Description:

Twitter is an online social media platform where people share their thoughts as tweets. It is observed that some people misuse it to tweet hateful content. Twitter is trying to tackle this problem, and we shall help by creating a strong NLP-based classifier model to distinguish negative tweets and block them. Can you build a strong classifier model to predict the same?

Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.
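
The quote-stripping step can be sketched as follows. This is a minimal illustration, assuming the column names given in this description (textID, text, selected_text, sentiment) and an inline sample instead of the real file. Python's csv module already removes the quotes that delimit a field; the hypothetical helper strip_outer_quotes additionally removes one pair of literal quotes that may survive inside the field. The sample rows are invented for illustration.

```python
import csv
import io


def strip_outer_quotes(value: str) -> str:
    """Remove one pair of matching leading/trailing double quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


# Hypothetical sample in the documented column layout (not real dataset rows).
SAMPLE_CSV = (
    "textID,text,selected_text,sentiment\n"
    'a1,"I love this, truly","love this",positive\n'
    'b2,"""so sad today""","so sad",negative\n'
)


def load_rows(handle):
    """Parse the CSV and clean the two text fields of each row."""
    rows = []
    for row in csv.DictReader(handle):
        row["text"] = strip_outer_quotes(row["text"])
        row["selected_text"] = strip_outer_quotes(row["selected_text"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["textID"], repr(row["text"]), row["sentiment"])
```

To run this against the downloaded file, pass an open file handle (opened with newline="" as the csv module recommends) to load_rows instead of the in-memory sample.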

You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

Columns:

textID - unique ID for each piece of text

text - the text of the tweet

sentiment - the general sentiment of the tweet

Acknowledgement:

The dataset was downloaded from Kaggle Competitions:
https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

Objective:

Understand the Dataset & cleanup (if required).

Build classification models to predict the twitter sentiments.

Compare the evaluation metrics of various classification algorithms.
In-Depth Twitter Retweet Analysis Dataset
kaggle.com
zip
Updated Jul 30, 2024
Cite
Mulenga Kawimbe (2024). In-Depth Twitter Retweet Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/mulengakawimbe89/in-depth-twitter-retweet-analysis-dataset
Explore at:
Available download formats: zip (51790 bytes)
Dataset updated
Jul 30, 2024
Authors
Mulenga Kawimbe
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
In-Depth Twitter Retweet Analysis Dataset

Dataset Overview

This dataset provides an extensive analysis of Twitter retweet activities, focusing on various attributes that can influence and describe the nature of retweets. It consists of multiple rows of data, each representing a unique Twitter retweet instance with detailed information on its characteristics.

Dataset Columns

Weekday: The day of the week when the retweet occurred.

Example values: "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"

Hour: The hour of the day when the retweet was made, in 24-hour format.

Example values: 0, 1, 2, ..., 23

Day: The day of the month when the retweet was posted.

Example values: 1, 2, 3, ..., 31

Lang: The language code of the tweet that was retweeted.

Example values: "en" (English), "es" (Spanish), "fr" (French)

Reach: The estimated number of users who have seen the retweet.

RetweetCount: The number of times the retweeted tweet has been retweeted further.

Likes: The number of likes received by the retweeted tweet.

Klout: The Klout score of the user who posted the original tweet, which is a measure of their influence on social media.

Sentiment: The sentiment score of the retweeted tweet, indicating the overall emotional tone.

Example values: -1.0 (very negative), 0.0 (neutral), 1.0 (very positive)

LocationID: A numerical identifier representing the geographical location of the user who posted the retweet.

Usage

This dataset can be utilized for various analyses, including:

- Identifying peak times for retweets
- Analyzing the influence of tweet attributes on retweet rates
- Sentiment analysis of popular retweets
- Geographical distribution of retweet activity
- Correlating Klout scores with retweet reach and engagement
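
The first analysis listed, identifying peak times for retweets, can be sketched as a group-by over the Hour column. This is a minimal example under stated assumptions: the rows below are invented, only the column names Hour and RetweetCount come from the dataset description, and the mean retweet count per hour is just one possible definition of a "peak".

```python
from collections import defaultdict

# Hypothetical rows using the documented column names (Hour, RetweetCount).
SAMPLE_ROWS = [
    {"Hour": 9, "RetweetCount": 12},
    {"Hour": 9, "RetweetCount": 8},
    {"Hour": 14, "RetweetCount": 30},
    {"Hour": 14, "RetweetCount": 50},
    {"Hour": 23, "RetweetCount": 5},
]


def mean_retweets_by_hour(rows):
    """Return {hour: mean RetweetCount} for the given rows."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        hour = int(row["Hour"])
        totals[hour] += int(row["RetweetCount"])
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


def peak_hour(rows):
    """Return the hour with the highest mean RetweetCount."""
    means = mean_retweets_by_hour(rows)
    return max(means, key=means.get)


means = mean_retweets_by_hour(SAMPLE_ROWS)
best = peak_hour(SAMPLE_ROWS)
print(means, best)
```

The same pattern extends to Weekday or Lang by changing the grouping key.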

Applications

Researchers, marketers, and social media analysts can use this dataset to gain insights into Twitter retweet behavior, optimize social media strategies, and understand the factors contributing to the virality of tweets.
Twitter Revenue Growth
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Revenue Growth [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Advertising makes up 89% of its total revenue and data licensing makes up about 11%.
Unleashing Social Sentiments: A Twitter Analysis
kaggle.com
zip
Updated Feb 27, 2023
Cite
Joy Shil (2023). Unleashing Social Sentiments: A Twitter Analysis [Dataset]. https://www.kaggle.com/datasets/joyshil0599/unleashing-social-sentiments-a-twitter-analysis
Explore at:
Available download formats: zip (404155 bytes)
Dataset updated
Feb 27, 2023
Authors
Joy Shil
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
"Unleashing Social Sentiments: A Twitter Analysis" appears to be a study or analysis that uses a Twitter dataset to explore the sentiment and opinions of Twitter users towards a particular topic or set of topics. Without more information about the study, it is difficult to provide a detailed analysis. However, based on the title and the use of a Twitter dataset, it is likely that the study involves the use of sentiment analysis techniques to analyze the opinions and sentiment expressed in the dataset.

The use of Twitter data for sentiment analysis has become increasingly popular in recent years due to the massive volume of data available and the ease with which opinions and sentiment can be expressed on the platform. By analyzing Twitter data, researchers can gain insights into public opinion and sentiment on a wide range of topics, from politics to consumer products to social issues.

To conduct a Twitter analysis, researchers typically collect a dataset of tweets related to a particular topic or set of topics. This dataset may include features such as the Twitter username, the tweet content, the time and date of the tweet, and any associated metadata such as hashtags or mentions. The dataset can then be processed using NLP or sentiment analysis techniques to classify the sentiment expressed in each tweet as positive, negative, or neutral.

The dataset contains tweets from the Twitter API that were scraped for seven hashtags:

#Messi: This hashtag refers to the Argentine soccer superstar Lionel Messi, and is commonly used by fans and followers to discuss his performances, accomplishments, and news related to his career.

#FIFAWorldCup: This hashtag is used during the FIFA World Cup, a quadrennial international soccer tournament. Tweets with this hashtag may discuss news, scores, or analysis related to the tournament.

#DeleteFacebook: This hashtag is used by people who advocate for deleting or boycotting Facebook, often in response to controversies related to data privacy, political advertising, or other issues related to the social media giant.

#MeToo: This hashtag is used in the context of the Me Too movement, a social movement against sexual harassment and assault, particularly in the workplace. Tweets with this hashtag may share personal stories, express support for the movement, or discuss related news and events.

#BlackLivesMatter: This hashtag is used in the context of the Black Lives Matter movement, a movement against police brutality and systemic racism towards Black people. Tweets with this hashtag may express support for the movement, share news and updates, or discuss related issues.

#NeverAgain: This hashtag is used in the context of the Never Again movement, which advocates for gun control and other measures to prevent school shootings and other acts of gun violence.

#BarCamp: This hashtag refers to BarCamp, an international network of unconferences - participant-driven conferences that are open and free to attend. Tweets with this hashtag may discuss upcoming BarCamp events, share insights or learnings from past events, or express support for the BarCamp community.

The sentiment score was generated using a pre-trained sentiment analysis model, and represents the overall sentiment of the tweet (positive, negative, or neutral).

The data can be used to gain insights into how people are discussing and reacting to these topics on Twitter, and how the sentiment towards these hashtags may have evolved over time. Researchers and analysts can use this dataset for sentiment analysis, natural language processing, and machine learning applications.

Some potential analyses that can be performed on the data include sentiment trend analysis over time, geographical distribution of sentiments, and topic modeling to identify themes and topics that emerge from the tweets.

Overall, the dataset provides a rich resource for researchers and analysts interested in studying social and political issues on social media.
Top 1000 Twitter Celebrity Tweets And Embeddings
kaggle.com
zip
Updated Jul 12, 2022
Cite
Ahmed Shahriar Sakib (2022). Top 1000 Twitter Celebrity Tweets And Embeddings [Dataset]. https://www.kaggle.com/datasets/ahmedshahriarsakib/top-1000-twitter-celebrity-tweets-embeddings
Explore at:
Available download formats: zip (179140866 bytes)
Dataset updated
Jul 12, 2022
Authors
Ahmed Shahriar Sakib
Description
Context

This dataset contains tweets and embeddings of the top 1000 Twitter celebrity accounts

Content

Tweets -

Contains 915 celebrity users' tweets

The tweets were scraped using tweepy - a python wrapper of Twitter API; following this celebrity list

Top 1000 Twitter Celebrity Accounts

Embeddings -

A single CSV file containing embeddings of all tweets (each user per row)

After preprocessing the tweets, the tweets were embedded using sentence-transformers pre-trained multilingual model - paraphrase-multilingual-MiniLM-L12-v2.

You can also find the embedding data here - Twitter Celebrity Embed Data
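
One plausible use of per-user embeddings like these is nearest-neighbour matching by cosine similarity. The sketch below illustrates that idea only; it is not confirmed as the dataset author's method. The three-dimensional vectors and user names are hypothetical stand-ins (real sentence embeddings have many more dimensions), and no sentence-transformers call is made.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical per-user embeddings, one vector per user as in the CSV layout.
USER_EMBEDDINGS = {
    "user_a": [1.0, 0.0, 0.0],
    "user_b": [0.0, 1.0, 0.0],
    "user_c": [0.6, 0.8, 0.0],
}


def closest_user(query, embeddings):
    """Return the user whose embedding is most similar to the query vector."""
    return max(embeddings, key=lambda name: cosine_similarity(query, embeddings[name]))


query = [0.9, 0.1, 0.0]  # hypothetical embedding of some other account
match = closest_user(query, USER_EMBEDDINGS)
print(match)
```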

NB:

- Almost 10% of the Twitter accounts were private, had changed their username, or were suspended, so the final number of users is 915.
- There are some unofficial celebrity accounts (e.g., twitter.com/sonunigam) with very few tweets. These users can be filtered out based on their tweet count. A good research paper on this topic is 25 Tweets to Know You: A New Model to Predict Personality with Social Media.

Featured Notebook

Twitter Celebrity Matcher (SBERT+ Tweepy)

Live App

GitHub Project

ahmedshahriar/TwitterCelebrityMatcher

Download

kaggle API Command

!kaggle datasets download -d ahmedshahriarsakib/top-1000-twitter-celebrity-tweets-embeddings

Disclaimer

The tweets which were scraped are all publicly available and it's intended for educational purposes only.

Acknowledgement

Cover image credit - bestfunquiz- Which Celebrity On Twitter Should Follow You
Tweets and User Engagement
kaggle.com
zip
Updated Dec 6, 2023
Cite
The Devastator (2023). Tweets and User Engagement [Dataset]. https://www.kaggle.com/thedevastator/tweets-and-user-engagement
Explore at:
Available download formats: zip (9121838 bytes)
Dataset updated
Dec 6, 2023
Authors
The Devastator
Description
Tweets and User Engagement

Twitter Data: Tweet Characteristics and Engagement Metrics

By Krystal Jensen [source]

About this dataset

The dataset Twitter Data: Tweets and User Interactions provides comprehensive information about tweets and user interactions on the popular social media platform Twitter. The dataset includes various attributes that shed light on the characteristics and engagement metrics of tweets, allowing for in-depth analysis of user behavior and content performance.

One of the key variables in this dataset is the Klout score, which represents the influence and reputation of the Twitter users who posted the tweets. This numeric metric helps assess the impact a user has on their audience and provides insights into their social media presence.

Another essential attribute is the text content of each tweet. By examining this textual data, analysts can uncover valuable information about trending topics, opinions, sentiments, conversations, or news shared by users. It serves as a primary source for understanding what people share publicly on Twitter.

The dataset Twitter+data+in+sheets.csv serves as a reliable resource for conducting research or performing analytics that require detailed information about Twitter activity. It covers aspects such as tweet characteristics (including length and language), engagement metrics (such as retweets and favorites), sentiment analysis (revealing positive or negative emotions expressed), as well as individual user details.

By utilizing this extensive dataset, researchers can gain valuable insights into patterns of online communication within Twitter's vast network. They can identify influential individuals with high Klout scores who have substantial reach among their followers or communities. Additionally, they can analyze various aspects related to tweet content such as sentiment analysis to understand public opinion trends or measure engagement levels through counts like retweets and favorites.

Overall, this dataset is a useful resource for anyone interested in analyzing tweet characteristics, exploring how users interact with tweets across dimensions such as popularity or sentiment, or examining correlations between Klout scores and other factors that influence engagement, such as the time a tweet was posted.

How to use the dataset

Welcome to the Twitter Data: Tweets and User Interactions dataset! This dataset provides valuable insights into tweet characteristics and user engagement on Twitter. Here is a useful guide on how to make the most out of this dataset:

Understanding the Columns: There are two main columns in this dataset:

Klout Score (Numeric): The Klout score indicates the influence of the user who posted the tweet. A higher Klout score suggests greater influence and reach.

Text Content of Tweet (Text): This column contains the actual text content of each tweet.

Analyzing Tweet Characteristics: The text content column will help you understand various aspects of tweets, such as language, sentiment, trending topics, or specific keywords used by users. You can perform text analysis techniques like word frequency analysis or sentiment analysis to gain insights into tweet characteristics.

Examining User Engagement: The Klout score provides a measure of user influence on Twitter. By analyzing this column, you can identify highly influential users who generate higher engagement rates with their tweets. You can further explore interactions (likes, retweets, replies) between these influential users and other Twitter users mentioned in their tweets.

Identifying Trends and Patterns: With this dataset's rich information about tweet content and user engagement, you can identify popular trends or patterns among highly engaged tweets or influential users over different time periods.

Remember that dates are not included in this guide since they were not provided in the original request for creating it.

Please note that it is essential to responsibly use this data for any analysis or research purposes while adhering to ethical considerations related to privacy rights and data usage policies set by both Kaggle platform rules as well as any relevant privacy regulations.


Research Ideas

Analyzing the relationship between Klout score and the content of tweets: This dataset can be used to investigate whether there is a correlation between a user's Klout score (a measure of their social media influence) and the characteristics of their tweets. By examining factors such as tweet length, sentiment, and engagement metrics, researchers can gain...
Twitter users in the United States 2019-2028
statista.com
Updated Jul 30, 2025
+ more versions
Cite
Statista Research Department (2025). Twitter users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
Explore at:
Dataset updated
Jul 30, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
United States
Description
The number of Twitter users in the United States was forecast to continuously increase between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 85.08 million users, a new peak, in 2028. Notably, the number of Twitter users had been continuously increasing over the past years.

User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of Twitter users in countries like Canada and Mexico.
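
The forecast figures can be cross-checked with a short calculation. This sketch derives the implied 2024 base from the stated 2028 value and absolute increase, then recomputes the percentage growth. The variable names are illustrative, and because the 4.3 million figure is itself rounded, the derived base is approximate.

```python
# Figures quoted in the description (millions of users).
FORECAST_2028_MILLION = 85.08
INCREASE_MILLION = 4.3

# Implied 2024 base: the 2028 forecast minus the stated increase.
baseline_2024 = FORECAST_2028_MILLION - INCREASE_MILLION

# Relative growth over the 2024-2028 period, in percent.
growth_percent = INCREASE_MILLION / baseline_2024 * 100

print(f"Implied 2024 base: {baseline_2024:.2f} million")
print(f"Implied growth: {growth_percent:.2f}%")
```

The result agrees with the +5.32 percent stated in the description.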
Greekgodx Tweets: Analyzing Conversation
kaggle.com
zip
Updated Dec 27, 2022
Cite
The Devastator (2022). Greekgodx Tweets: Analyzing Conversation [Dataset]. https://www.kaggle.com/datasets/thedevastator/greekgodx-tweets-analyzing-conversation-interact
Explore at:
Available download formats: zip (455339 bytes)
Dataset updated
Dec 27, 2022
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Greekgodx Tweets: Analyzing Conversation Interactions on Social Media

Retweets, Likes, and Mentions

By Twitter [source]

About this dataset

This dataset provides a unique opportunity to unravel the intricacies of a conversational exchange on social media platforms, by exploring the complex interplay between retweets, likes, mentions and replies. Greekgodx is an immensely popular Twitch streamer and YouTuber, whose tweets offer invaluable insights into how people interact with each other on social media networks. Through this data set we can gain an understanding of user engagement levels, the influence of certain topics or interests on conversations, as well as explore new techniques for measuring sentiment in social media conversations. With these tools in hand we will be better equipped to interpret popular conversations occurring online and more confidently make decisions based upon insights gleaned from our analysis

How to use the dataset

This dataset is a useful resource for those wanting to explore and analyze the conversational dynamics that occur on social media platforms. It includes tweets from popular Twitch streamer and YouTuber, Greekgodx, whose content often inspires engagement from his followers as well as other online users. Here you will find various columns that provide an opportunity to investigate this data in a number of ways, such as investigating any retweets or likes he receives in response to his tweets or the mentions he gets from other users.

The data included here consists of six columns: id, tweet_text, timestamp, retweets_count, likes_count and mentions. All of these features help you gain insights into different elements of interaction between Greekgodx and other Twitter users by providing information about when particular tweets were published (timestamp), how many people have engaged with them (retweets count/likes count) or what kind of people are talking about him (mentions). Additionally, the id column provides an identifier for each tweet which can be used for further analysis if needed.

To work effectively with this dataset, one could first use basic visualization techniques such as histograms or bar plots to identify initial trends in how often Greekgodx is retweeted or liked within certain periods of time, or which Twitter users mention him most frequently. More advanced techniques such as direct network analysis can also be used for more detailed insight into relationships between members of the platform. These could suggest which individuals are most influential in terms of replicating content posted by Greekgodx, or who is most active when engaging with him in public conversations on Twitter.
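
As a first pass before any plotting, mention frequency and per-tweet engagement can be tallied directly. The sketch below uses the column names given above (id, retweets_count, likes_count, mentions), but the rows are invented and the format of the mentions column is an assumption: it is treated here as a comma-separated string of handles, which may not match the real file.

```python
from collections import Counter

# Hypothetical rows; only the column names come from the description.
SAMPLE_TWEETS = [
    {"id": "1", "retweets_count": 10, "likes_count": 100, "mentions": "alice,bob"},
    {"id": "2", "retweets_count": 3, "likes_count": 40, "mentions": "alice"},
    {"id": "3", "retweets_count": 7, "likes_count": 55, "mentions": ""},
]


def count_mentions(tweets):
    """Count how often each handle appears in the mentions column."""
    counts = Counter()
    for tweet in tweets:
        # Assumed format: comma-separated handles; empty string means no mentions.
        handles = [h.strip() for h in tweet["mentions"].split(",") if h.strip()]
        counts.update(handles)
    return counts


def total_engagement(tweet):
    """Retweets plus likes for one tweet."""
    return int(tweet["retweets_count"]) + int(tweet["likes_count"])


mention_counts = count_mentions(SAMPLE_TWEETS)
top_tweet = max(SAMPLE_TWEETS, key=total_engagement)
print(mention_counts.most_common(1), top_tweet["id"])
```

The resulting counts are the kind of data a bar plot of most frequent mentions would be drawn from.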

Research Ideas

Analyzing the Impact of Tweets on Popularity: This dataset can be used to analyze how Greekgodx’s tweets are affecting his popularity and viewership, by looking at engagement metrics such as retweets, likes and mentions over time.

Exploring Network Dynamics: The dataset can be used to explore the network dynamics of conversations taking place on Twitter, by examining relationships between replies, retweets, likes and mentions over time.

Investigating Sentiment Analysis of Tweets: This dataset provides a great opportunity to understand sentiment analysis on social media platforms by analyzing the sentiment associated with Greekgodx’s tweets using natural language processing (NLP) techniques and understanding how it affects his engagement levels with followers through retweets, likes, mentions, etc.

Acknowledgements

If you use this dataset in your research, please credit the original authors and Twitter.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Sentiment Analysis on Financial Tweets
kaggle.com
zip
Updated Sep 5, 2019
Cite
Vivek Rathi (2019). Sentiment Analysis on Financial Tweets [Dataset]. https://www.kaggle.com/datasets/vivekrathi055/sentiment-analysis-on-financial-tweets
Explore at:
Available download formats: zip (2538259 bytes)
Dataset updated
Sep 5, 2019
Authors
Vivek Rathi
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Context

The following information can also be found at https://www.kaggle.com/davidwallach/financial-tweets. Out of curiosity, I cleaned the .csv files to perform a sentiment analysis, so both .csv files in this dataset were created by me.

Everything in the description below was written by David Wallach; using this information, I performed my first ever sentiment analysis.

"I have been interested in using public sentiment and journalism to gather sentiment profiles on publicly traded companies. I first developed a Python package (https://github.com/dwallach1/Stocker) that scrapes the web for articles written about companies, and then noticed the abundance of overlap with Twitter. I then developed a NodeJS project that I have been running on my RaspberryPi to monitor Twitter for all tweets coming from those mentioned in the content section. If one of them tweeted about a company in the stocks_cleaned.csv file, then it would write the tweet to the database. Currently, the file is only from earlier today, but after about a month or two, I plan to update the tweets.csv file (hopefully closer to 50,000 entries).

I am not quite sure how this dataset will be relevant, but I hope to use these tweets and try to generate some sense of public sentiment score."

Content

This dataset has all the publicly traded companies (tickers and company names) that were used as input to fill the tweets.csv. The influencers whose tweets were monitored were: ['MarketWatch', 'business', 'YahooFinance', 'TechCrunch', 'WSJ', 'Forbes', 'FT', 'TheEconomist', 'nytimes', 'Reuters', 'GerberKawasaki', 'jimcramer', 'TheStreet', 'TheStalwart', 'TruthGundlach', 'Carl_C_Icahn', 'ReformedBroker', 'benbernanke', 'bespokeinvest', 'BespokeCrypto', 'stlouisfed', 'federalreserve', 'GoldmanSachs', 'ianbremmer', 'MorganStanley', 'AswathDamodaran', 'mcuban', 'muddywatersre', 'StockTwits', 'SeanaNSmith']

Acknowledgements

The data used here was gathered from a project I developed: https://github.com/dwallach1/StockerBot

Inspiration

I hope to develop a financial sentiment text classifier that would be able to track Twitter's (and the entire public's) feelings about any publicly traded company (and cryptocurrency)
X/Twitter: Countries with the largest audience 2025
statista.com
Cite
Statista, X/Twitter: Countries with the largest audience 2025 [Dataset]. https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/
Explore at:
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Oct 2025
Area covered
Worldwide
Description
As of October 2025, social network X (formerly known as Twitter) was most popular in the United States, with an audience reach of approximately 99.04 million users. Japan ranked second, recording more than 71 million users on the platform.

Global Twitter usage

As of the second quarter of 2021, X/Twitter had 206 million monetizable daily active users worldwide. The most-followed Twitter accounts include figures such as Elon Musk, Justin Bieber and former U.S. president Barack Obama.

X/Twitter and politics

X/Twitter has become an increasingly relevant tool in domestic and international politics. The platform has become a way to promote policies and interact with citizens and other officials, and most world leaders and foreign ministries have an official Twitter account. Former U.S. president Donald Trump used to be a prolific Twitter user before the platform permanently suspended his account in January 2021. During an August 2018 survey, 61 percent of respondents stated that Trump's use of Twitter as President of the United States was inappropriate.
Twitter Key Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Key Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are the key Twitter user statistics that you need to know.
Twitter Users Broken down By Country
searchlogistics.com
Updated Apr 1, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). Twitter Users Broken down By Country [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The US has historically been the target country for Twitter since its launch in 2006. This is the full breakdown of Twitter users by country.
Twitter Users Statistics 2025: Monthly Active Users, Regional Data & More
sqmagazine.co.uk
Updated Oct 1, 2025
Cite
SQ Magazine (2025). Twitter Users Statistics 2025: Monthly Active Users, Regional Data & More [Dataset]. https://sqmagazine.co.uk/twitter-users-statistics/
Explore at:
Dataset updated
Oct 1, 2025
Dataset authored and provided by
SQ Magazine
License
https://sqmagazine.co.uk/privacy-policy/
Time period covered
Jan 1, 2024 - Dec 31, 2025
Area covered
Global
Description
In early 2025, something fascinating happened at a small community center in suburban Ohio. A town hall meeting about local road closures suddenly went viral, not because of the topic, but because a 74-year-old attendee live-tweeted the entire event using her iPad. Within hours, her posts racked up thousands of...
Why Do People Use Twitter?
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Why Do People Use Twitter? [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
One of the biggest advantages of Twitter is the speed at which information can be passed around. People use Twitter primarily to get news and for entertainment. This is the breakdown of why people use Twitter today.
X/Twitter: platform manipulation and spam actions H2 2024
statista.com
Updated Oct 16, 2024
Cite
Statista Research Department (2024). X/Twitter: platform manipulation and spam actions H2 2024 [Dataset]. https://www.statista.com/topics/737/twitter/
Explore at:
Dataset updated
Oct 16, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Description
Between July and December 2024, over 335 million accounts on X (formerly Twitter) were suspended for reasons of spam or platform manipulation. User-informed labels were added to 66 million posts after being reported for spam.
Data from: Twitter Big Data as A Resource For Exoskeleton Research: A...
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
+ more versions
Cite
Thakur, Nirmalya (2023). Twitter Big Data as A Resource For Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions [Dataset]. http://doi.org/10.7910/DVN/VPPTRF
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/VPPTRF
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Thakur, Nirmalya
Description
Please cite the following paper when using this dataset: N. Thakur, “Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions,” Preprints, 2022, DOI: 10.20944/preprints202206.0383.v1

Abstract

The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and use cases in assisted living, military, healthcare, firefighting, and industries. With the projected increase in the diverse uses of exoskeletons in the next few years in these application domains and beyond, it is crucial to study, interpret, and analyze user perspectives, public opinion, reviews, and feedback related to exoskeletons, for which a dataset is necessary. The Internet of Everything era of today's living, characterized by people spending more time on the Internet than ever before, holds the potential for developing such a dataset by mining relevant web behavior data from social media communications, which have increased exponentially in the last few years. Twitter, one such social media platform, is highly popular amongst all age groups, who communicate on diverse topics including but not limited to news, current events, politics, emerging technologies, family, relationships, and career opportunities, via tweets, while sharing their views, opinions, perspectives, and feedback towards the same. Therefore, this work presents a dataset of about 140,000 Tweets related to exoskeletons that were mined for a period of 5 years from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons.

Instructions

This dataset contains about 140,000 Tweets related to exoskeletons that were mined for a period of 5 years from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons. The dataset contains only tweet identifiers (Tweet IDs) due to the terms and conditions of Twitter to re-distribute Twitter data only for research purposes. They need to be hydrated to be used. The process of retrieving a tweet's complete information (such as the text of the tweet, username, user ID, date and time, etc.) using its ID is known as the hydration of a tweet ID. The Hydrator application (link to download the application: https://github.com/DocNow/hydrator/releases and link to a step-by-step tutorial: https://towardsdatascience.com/learn-how-to-easily-hydrate-tweets-a0f393ed340e#:~:text=Hydrating%20Tweets) or any similar application may be used for hydrating this dataset.

Data Description

This dataset consists of 7 .txt files. The following shows the number of Tweet IDs and the date range (of the associated tweets) in each of these files.
Filename: Exoskeleton_TweetIDs_Set1.txt (Number of Tweet IDs – 22945, Date Range of Tweets - July 20, 2021 – May 21, 2022) Filename: Exoskeleton_TweetIDs_Set2.txt (Number of Tweet IDs – 19416, Date Range of Tweets - Dec 1, 2020 – July 19, 2021) Filename: Exoskeleton_TweetIDs_Set3.txt (Number of Tweet IDs – 16673, Date Range of Tweets - April 29, 2020 - Nov 30, 2020) Filename: Exoskeleton_TweetIDs_Set4.txt (Number of Tweet IDs – 16208, Date Range of Tweets - Oct 5, 2019 - Apr 28, 2020) Filename: Exoskeleton_TweetIDs_Set5.txt (Number of Tweet IDs – 17983, Date Range of Tweets - Feb 13, 2019 - Oct 4, 2019) Filename: Exoskeleton_TweetIDs_Set6.txt (Number of Tweet IDs – 34009, Date Range of Tweets - Nov 9, 2017 - Feb 12, 2019) Filename: Exoskeleton_TweetIDs_Set7.txt (Number of Tweet IDs – 11351, Date Range of Tweets - May 21, 2017 - Nov 8, 2017) Here, the last date for May is May 21 as it was the most recent date at the time of data collection. The dataset would be updated soon to incorporate more recent tweets.
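As a rough illustration of preparing these files for hydration, the sketch below reads Tweet IDs and groups them into batches. It assumes each .txt file holds one Tweet ID per line (the file layout is not stated in the description), and the batch size of 100 and the helper names `read_tweet_ids`, `batched`, and `load_file` are illustrative choices, not part of the dataset or of Hydrator. The per-file counts are copied from the data description above.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def read_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Return unique Tweet IDs in first-seen order.

    Assumption: one ID per line. Blank lines and lines that are not
    purely digits are skipped. IDs are kept as strings.
    """
    seen = set()
    ids = []
    for line in lines:
        tweet_id = line.strip()
        if tweet_id.isdigit() and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


def batched(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` IDs (size is an assumption)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def load_file(path: str) -> List[str]:
    """Read one ID file from disk, e.g. Exoskeleton_TweetIDs_Set1.txt."""
    with Path(path).open(encoding="utf-8") as handle:
        return read_tweet_ids(handle)


# Number of Tweet IDs per file, as listed in the data description.
FILE_COUNTS = {
    "Exoskeleton_TweetIDs_Set1.txt": 22945,
    "Exoskeleton_TweetIDs_Set2.txt": 19416,
    "Exoskeleton_TweetIDs_Set3.txt": 16673,
    "Exoskeleton_TweetIDs_Set4.txt": 16208,
    "Exoskeleton_TweetIDs_Set5.txt": 17983,
    "Exoskeleton_TweetIDs_Set6.txt": 34009,
    "Exoskeleton_TweetIDs_Set7.txt": 11351,
}

# Sum of the listed counts; consistent with "about 140,000" tweets.
TOTAL_IDS = sum(FILE_COUNTS.values())
```

Each batch produced by `batched(load_file(...))` could then be passed to whichever hydration tool or lookup service is used; that step is left out here because it depends on the tool.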
Tweets Dataset - Top 20 most followed users in Twitter social platform
dataverse.harvard.edu
Updated Aug 18, 2017
Raad Bin Tareaf (2017). Tweets Dataset - Top 20 most followed users in Twitter social platform [Dataset]. http://doi.org/10.7910/DVN/JBXKFD
Unique identifier
https://doi.org/10.7910/DVN/JBXKFD
Dataset updated
Aug 18, 2017
Dataset provided by
Harvard Dataverse
Authors
Raad Bin Tareaf
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
- This dataset was gathered by crawling Twitter's REST API using the Python library tweepy 3. It contains the tweets of the 20 most popular Twitter users (those with the most followers); retweets are excluded. These accounts belong to public figures such as Katy Perry and Barack Obama, platforms such as YouTube and Instagram, and television channels and shows, e.g., CNN Breaking News and The Ellen Show.
- Consequently, the dataset contains a mix of relatively structured tweets, tweets written in a formal and informative manner, and completely unstructured tweets written in a colloquial style. Unfortunately, the geocoordinates were not available for those tweets.
- This dataset has been used to produce the research paper titled "Machine Learning Techniques for Anomalies Detection in Post Arrays".
- Crawled attributes are: Author (Twitter User), Content (Tweet), Date_Time, id (Twitter User ID), language (Tweet Language), Number_of_Likes, Number_of_Shares.

Overall: 52543 tweets of top 20 users in Twitter

Screen_Name      #Tweets   Time span (in days)
TheEllenShow     3,147     662
jimmyfallon      3,123     1,231
ArianaGrande     3,104     613
YouTube          3,077     411
KimKardashian    2,939     603
katyperry        2,924     1,598
selenagomez      2,913     2,266
rihanna          2,877     1,557
BarackObama      2,863     849
britneyspears    2,776     1,548
instagram        2,577     456
shakira          2,530     1,850
Cristiano        2,507     2,407
jtimberlake      2,478     2,491
ladygaga         2,329     894
Twitter          2,290     2,593
ddlovato         2,217     741
taylorswift13    2,029     2,091
justinbieber     2,000     664
cnnbrk           1,842     183
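One simple way to compare these accounts is posting frequency, i.e. tweets divided by time span. The sketch below is only an illustration of that arithmetic on a subset of the figures listed above; the helper name `tweets_per_day` is hypothetical, and the rate is a derived quantity, not a column of the dataset.

```python
from typing import List, Tuple

# (screen_name, number of tweets, time span in days), copied from the
# figures listed above. Only a subset is shown to keep the example short.
ACCOUNTS: List[Tuple[str, int, int]] = [
    ("TheEllenShow", 3147, 662),
    ("katyperry", 2924, 1598),
    ("BarackObama", 2863, 849),
    ("Twitter", 2290, 2593),
    ("cnnbrk", 1842, 183),
]


def tweets_per_day(tweets: int, span_days: int) -> float:
    """Average number of tweets per day over the crawled time span."""
    if span_days <= 0:
        raise ValueError("span_days must be positive")
    return tweets / span_days


# Sort accounts from most to least frequent posters.
rates = sorted(
    ((name, tweets_per_day(tweets, days)) for name, tweets, days in ACCOUNTS),
    key=lambda item: item[1],
    reverse=True,
)

for name, rate in rates:
    print(f"{name:<14}{rate:6.2f} tweets/day")
```

Because each account's time span differs widely, a rate of this kind is easier to compare across accounts than the raw tweet counts.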
Twitter bot profiling
researchdata.smu.edu.sg
smu.edu.sg
pdf
Updated May 31, 2023
Living Analytics Research Centre (2023). Twitter bot profiling [Dataset]. http://doi.org/10.25440/smu.12062706.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.25440/smu.12062706.v1
Dataset updated
May 31, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
Living Analytics Research Centre
License
http://rightsstatements.org/vocab/InC/1.0/
Description
This dataset comprises a set of Twitter accounts in Singapore used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates content and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:

Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.
Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.

This categorization is general enough to cater for new, emerging types of bot (e.g., chatbots can be viewed as a special type of broadcast bot). The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users having many followers), their follow, retweet, and user mention links were crawled. The data collection proceeded by adding those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. Using this procedure, a total of 159,724 accounts were collected. To identify bots, the first step was to select active accounts that tweeted at least 15 times within the month of April 2014. These accounts were then manually checked and labelled, and 589 bots were found among them. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.
Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
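The activity filter described above (at least 15 tweets within April 2014) can be sketched as follows. This is a minimal illustration, not the authors' code: the input layout of `(account_id, timestamp)` pairs, the function name `active_accounts`, and the use of naive datetimes without a time zone are all assumptions.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Set, Tuple

MIN_TWEETS = 15                      # threshold stated in the description
WINDOW_START = datetime(2014, 4, 1)  # start of April 2014 (inclusive)
WINDOW_END = datetime(2014, 5, 1)    # end of April 2014 (exclusive)

# Label counts reported in the description.
BOTS = 589
HUMANS = 1024


def active_accounts(tweets: Iterable[Tuple[str, datetime]]) -> Set[str]:
    """Return accounts with at least MIN_TWEETS tweets inside the window.

    `tweets` is an iterable of (account_id, timestamp) pairs; this layout
    is hypothetical and chosen only for the example.
    """
    counts = Counter(
        account
        for account, timestamp in tweets
        if WINDOW_START <= timestamp < WINDOW_END
    )
    return {account for account, n in counts.items() if n >= MIN_TWEETS}


# The reported total of labelled accounts is the sum of both classes.
LABELLED_TOTAL = BOTS + HUMANS
```

Only the accounts returned by such a filter would go on to the manual checking and labelling step; that step relies on human judgment and is not something the code attempts to reproduce.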
Twitter Users Broken Down By Gender
searchlogistics.com
Updated Apr 1, 2025
(2025). Twitter Users Broken Down By Gender [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The platform is male-dominated with 68.1% of all Twitter users being male. Just 31.9% of Twitter users are female.
Data from: Academic information on Twitter: A user survey
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated May 18, 2018
Mohammadi, Ehsan; Thelwall, Mike; Holmes, Kristi L.; Kwasny, Mary (2018). Academic information on Twitter: A user survey [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000721082
Dataset updated
May 18, 2018
Authors
Mohammadi, Ehsan; Thelwall, Mike; Holmes, Kristi L.; Kwasny, Mary
Description
Although counts of tweets citing academic papers are used as an informal indicator of interest, little is known about who tweets academic papers and who uses Twitter to find scholarly information. Without knowing this, it is difficult to draw useful conclusions from a publication being frequently tweeted. This study surveyed 1,912 users that have tweeted journal articles to ask about their scholarly-related Twitter uses. Almost half of the respondents (45%) did not work in academia, despite the sample probably being biased towards academics. Twitter was used most by people with a social science or humanities background. People tend to leverage social ties on Twitter to find information rather than searching for relevant tweets. Twitter is used in academia to acquire and share real-time information and to develop connections with others. Motivations for using Twitter vary by discipline, occupation, and employment sector, but not much by gender. These factors also influence the sharing of different types of academic information. This study provides evidence that Twitter plays a significant role in the discovery of scholarly information and cross-disciplinary knowledge spreading. Most importantly, the large numbers of non-academic users support the claims of those using tweet counts as evidence for the non-academic impacts of scholarly research.
