100+ datasets found

X/Twitter: Countries with the largest audience 2025
statista.com
Updated Jun 19, 2025
Cite
Statista (2025). X/Twitter: Countries with the largest audience 2025 [Dataset]. https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/
Dataset updated
Jun 19, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
Worldwide
Description
Social network X/Twitter is particularly popular in the United States, and as of February 2025, the microblogging service had an audience reach of 103.9 million users in the country. Japan and India were ranked second and third with more than 70 million and 25 million users respectively.

Global Twitter usage

As of the second quarter of 2021, X/Twitter had 206 million monetizable daily active users worldwide. The most-followed Twitter accounts include figures such as Elon Musk, Justin Bieber and former U.S. president Barack Obama.

X/Twitter and politics

X/Twitter has become an increasingly relevant tool in domestic and international politics. The platform has become a way to promote policies and interact with citizens and other officials, and most world leaders and foreign ministries have an official Twitter account. Former U.S. president Donald Trump used to be a prolific Twitter user before the platform permanently suspended his account in January 2021. During an August 2018 survey, 61 percent of respondents stated that Trump's use of Twitter as President of the United States was inappropriate.
Twitter Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
Search Logistics (2025). Twitter Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
Dataset authored and provided by
Search Logistics
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These Twitter user statistics will give you the complete story of where Twitter is at today and what the future looks like for the social media company.
X/Twitter: number of worldwide users 2019-2024
statista.com
Updated Jun 26, 2025
Cite
Statista (2025). X/Twitter: number of worldwide users 2019-2024 [Dataset]. https://www.statista.com/statistics/303681/twitter-users-worldwide/
Dataset updated
Jun 26, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Dec 2022
Area covered
Worldwide
Description
As of December 2022, X/Twitter's audience accounted for over *** million monthly active users worldwide. This figure was projected to ******** to approximately *** million by 2024, a ******* of around **** percent compared to 2022.
Twitter Users Broken down By Country
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Users Broken down By Country [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The US has historically been the target country for Twitter since its launch in 2006. This is the full breakdown of Twitter users by country.
Twitter Friends
kaggle.com
Updated Sep 2, 2016
Cite
Hubert Wassner (2016). Twitter Friends [Dataset]. https://www.kaggle.com/datasets/hwassner/TwitterFriends/discussion?sortBy=recent
Dataset updated
Sep 2, 2016
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Hubert Wassner
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Twitter Friends and hashtags

Context

This dataset is an extract of a wider database aimed at collecting Twitter users' friends (the other accounts a user follows). The overall goal is to study users' interests through who they follow and the connection to the hashtags they've used.

Content

It's a list of Twitter users' information. In the JSON format, each Twitter user is stored as one object in a list of more than 40,000 objects. Each object holds:

avatar : URL to the profile picture

followerCount : the number of followers of this user

friendsCount : the number of accounts this user follows.

friendName : stores the @name (without the '@') of the user (beware this name can be changed by the user)

id : user ID, this number can not change (you can retrieve screen name with this service : https://tweeterid.com/)

friends : the list of IDs the user follows (data stored is IDs of users followed by this user)

lang : the language declared by the user (in this dataset there is only "en" (english))

lastSeen : the timestamp of the date when this user posted their last tweet.

tags : the hashtags (with or without #) used by the user. These are the "trending topics" the user tweeted about.

tweetID : Id of the last tweet posted by this user.

You also have the CSV format which uses the same naming convention.

These users were selected because they tweeted on Twitter trending topics. I selected users that have at least 100 followers and follow at least 100 other accounts (in order to filter out spam and non-informative/empty accounts).
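As a rough illustration of the record layout described above, the following Python sketch parses two made-up records, re-applies the 100-follower / 100-following selection rule, and counts hashtags. The sample values are invented, and the sketch assumes that friendsCount holds the number of accounts a user follows; the real file may differ in types or in details not listed here.

```python
import json
from collections import Counter

# Two made-up records that reuse the field names from the description.
# Assumption: friendsCount is the number of accounts the user follows.
SAMPLE = """
[
  {"id": "1", "friendName": "alice", "followerCount": 250, "friendsCount": 180,
   "friends": ["2", "3"], "lang": "en", "tags": ["#python", "data"]},
  {"id": "2", "friendName": "bob", "followerCount": 40, "friendsCount": 900,
   "friends": ["1"], "lang": "en", "tags": ["#Python"]}
]
"""


def normalize_tag(tag):
    """Lower-case a hashtag and drop a leading '#', since tags come with or without it."""
    return tag.lstrip("#").lower()


def meets_selection_rule(user, minimum=100):
    """True when the user has at least `minimum` followers and follows at least `minimum` accounts."""
    return user["followerCount"] >= minimum and user["friendsCount"] >= minimum


def tag_counts(users):
    """Count how often each normalized hashtag appears across all users."""
    return Counter(normalize_tag(t) for u in users for t in u["tags"])


users = json.loads(SAMPLE)
selected = [u["friendName"] for u in users if meets_selection_rule(u)]
print(selected)
print(tag_counts(users).most_common(1))
```

Normalizing the tags first keeps "#python" and "Python" from being counted as different topics, which matters because the description says the '#' is not always present.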

Acknowledgements

This dataset was built by Hubert Wassner (me) using the Twitter public API. More data can be obtained on request (hubert.wassner AT gmail.com); at this time I've collected over 5 million profiles in different languages. Some more information can be found here (in French only): http://wassner.blogspot.fr/2016/06/recuperer-des-profils-twitter-par.html

Past Research

No public research has been done on this dataset so far. I made a private application, described here (in French): http://wassner.blogspot.fr/2016/09/twitter-profiling.html, which uses the full dataset (millions of full profiles).

Inspiration

One can analyse a lot of things with this dataset:

stats about followers & followings

manifold learning or unsupervised learning from friend list

hashtag prediction from friend list

Contact

Feel free to ask any question (or help request) via Twitter : @hwassner

Enjoy! ;)
X/Twitter: number of monthly active users 2010-2019
statista.com
Updated Sep 13, 2023
Cite
Statista (2023). X/Twitter: number of monthly active users 2010-2019 [Dataset]. https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/
Dataset updated
Sep 13, 2023
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
How many people use X/Twitter?

As of the first quarter of 2019, X/Twitter averaged 330 million monthly active users (MAU), a decline from its all-time high of 336 million MAU in the first quarter of 2018. As of the first quarter of 2019, the company switched its user reporting metric to monetizable daily active users (mDAU).

X/Twitter

X/Twitter is a social networking and microblogging service, enabling registered users to read and post short messages called tweets. X/Twitter messages are limited to 280 characters and users are also able to upload photos or short videos. Tweets are posted to a publicly available profile or can be sent as direct messages to other users.

Part of the social platform’s appeal is the ability of users to follow any other user with a public profile, enabling users to interact with celebrities who regularly post on the social media site. Currently, the most-followed person on Twitter is singer Katy Perry with more than 107 million followers. Twitter has also become an important communications channel for governments and heads of state – U.S. President Donald Trump was the most-followed world leader on Twitter, followed by Pope Francis and Indian Prime Minister Narendra Modi.

Despite the widespread usage among the rich and famous, the decline in active users has not impressed investors, as the platform relies largely on delivering advertising to users to generate revenue. Twitter’s company revenue in 2018 amounted to three billion U.S. dollars, up from 2.44 billion in the preceding fiscal year. Twitter was only recently able to report a positive annual result for the first time, when the company generated 1.2 billion U.S. dollars in net income in 2018.
Twitter Users Broken Down By Age
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Users Broken Down By Age [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the breakdown of Twitter users by age group.
X/Twitter: distribution of global audiences 2025, by gender
statista.com
Updated Jun 19, 2025
Cite
Statista (2025). X/Twitter: distribution of global audiences 2025, by gender [Dataset]. https://www.statista.com/statistics/828092/distribution-of-users-on-twitter-worldwide-gender/
Dataset updated
Jun 19, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
Worldwide
Description
As of February 2025, micro-blogging platform X (formerly Twitter) was more popular with men than women, with male audiences accounting for 63.7 percent of global users. Additionally, users between the ages of 25 and 34 were particularly active on X/Twitter, making up more than 37 percent of users worldwide.

How many people use X/Twitter?

Although X/Twitter holds its status as a mainstream social media site, it falls short in comparison to other well-known platforms in terms of user numbers. As of early 2022, X/Twitter had around 436 million monthly active users, whilst Meta’s Facebook reached almost three billion MAU. Overall, the United States is home to over 105 million X/Twitter users, making up Twitter’s largest audience base, followed by Japan, India, and the United Kingdom, respectively.

How is Twitter used?

X/Twitter is utilized by its audience for many different purposes. In May 2021, over 80 percent of high-volume X/Twitter users (defined as users who tweet around 20 times per month) in the United States reported using the platform for entertainment, whilst 78 percent said they used it as a way to stay informed. High-volume X/Twitter users were far more likely to use the service as a means of expressing their opinion. Furthermore, in 2022, over half of social media users in the U.S. used Twitter as a news resource.
Following/Followers and Tags on 0.1 million Twitter Users
zenodo.org
data.niaid.nih.gov
application/gzip
Updated Jan 24, 2020
Cite
Mitsuo Yoshida; Yuto Yamaguchi (2020). Following/Followers and Tags on 0.1 million Twitter Users [Dataset]. http://doi.org/10.5281/zenodo.13966
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.13966
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mitsuo Yoshida; Yuto Yamaguchi
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Abstract (our paper)

Why does Smith follow Johnson on Twitter? In most cases, the reason why users follow other users is unavailable. In this work, we answer this question by proposing TagF, which analyzes the who-follows-whom network (matrix) and the who-tags-whom network (tensor) simultaneously. Concretely, our method decomposes a coupled tensor constructed from this matrix and tensor. The experimental results on million-scale Twitter networks show that TagF uncovers different, but explainable reasons why users follow other users.

Data

coupled_tensor:
The first column is the source user id (from user id), the second column is the destination user id (to user id), and the third column is the tag id.

users.id:
The first column is the user id for coupled_tensor, and the second column is the user id on Twitter.

tags.id:
The first column is the tag id for coupled_tensor, and the second column is the tag (i.e. slug or list name) on Twitter. On the tags, ###follow### and ###friend### are special tags expressing follower and following.
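The three files described above can be joined in a few lines. The sketch below is only an illustration: the rows are made up, and it assumes whitespace-separated columns, which the description does not state. It resolves each (source, destination, tag) triple to Twitter user ids and tag names, then keeps the rows carrying the special ###follow### tag.

```python
# Made-up rows standing in for the three files.
# Assumption: columns are separated by whitespace.
COUPLED_TENSOR = """\
0 1 0
0 2 1
1 2 0
"""
USERS_ID = """\
0 1001
1 1002
2 1003
"""
TAGS_ID = """\
0 ###follow###
1 data-science
"""


def read_mapping(text):
    """Map the first column (internal id) to the rest of the line (Twitter id or tag)."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split(maxsplit=1)
        mapping[int(key)] = value.strip()
    return mapping


def read_triples(text):
    """Parse (source user id, destination user id, tag id) rows as integers."""
    triples = []
    for line in text.splitlines():
        if not line.strip():
            continue
        src, dst, tag = (int(part) for part in line.split())
        triples.append((src, dst, tag))
    return triples


users = read_mapping(USERS_ID)
tags = read_mapping(TAGS_ID)
resolved = [(users[s], users[d], tags[t]) for s, d, t in read_triples(COUPLED_TENSOR)]
follow_edges = [(s, d) for s, d, t in resolved if t == "###follow###"]
print(resolved)
print(follow_edges)
```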

Publication

This dataset was created for our study. If you make use of this dataset, please cite:
Yuto Yamaguchi, Mitsuo Yoshida, Christos Faloutsos, Hiroyuki Kitagawa. Why Do You Follow Him? Multilinear Analysis on Twitter. Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). pp.137-138, 2015.
http://doi.org/10.1145/2740908.2742715

Code

Our code for producing the experiment results is available at:
https://github.com/yamaguchiyuto/tagf

Note

If you would like to use a larger dataset, the dataset on 1 million seed users is available at:
http://dx.doi.org/10.5281/zenodo.16267
(The dataset on 0.1 million seed users is not a subset of the dataset on 1 million seed users.)
Just Another Day on Twitter: A Complete 24 Hours of Twitter Data
search.gesis.org
Updated Oct 16, 2022
Cite
Pfeffer, Jürgen (2022). Just Another Day on Twitter: A Complete 24 Hours of Twitter Data [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-2516
Dataset updated
Oct 16, 2022
Dataset provided by
GESIS, Köln
GESIS search
Authors
Pfeffer, Jürgen
License
https://www.gesis.org/en/institute/data-usage-terms
Description
At the end of October 2022, Elon Musk concluded his acquisition of Twitter. In the weeks and months before that, several questions were publicly discussed that were not only of interest to the platform's future buyers, but also of high relevance to the Computational Social Science research community. For example, how many active users does the platform have? What percentage of accounts on the site are bots? And, what are the dominating topics and sub-topical spheres on the platform? In a globally coordinated effort of 80 scholars to shed light on these questions, and to offer a dataset that will equip other researchers to do the same, we have collected 375 million tweets published within a 24-hour time period starting on September 21, 2022. To the best of our knowledge, this is the first complete 24-hour Twitter dataset that is available for the research community. With it, the present work aims to accomplish two goals. First, we seek to answer the aforementioned questions and provide descriptive metrics about Twitter that can serve as references for other researchers. Second, we create a baseline dataset for future research that can be used to study the potential impact of the platform's ownership change.
World leaders with the most Twitter followers 2020
statista.com
Updated Apr 28, 2022
Cite
Statista (2022). World leaders with the most Twitter followers 2020 [Dataset]. https://www.statista.com/statistics/281375/heads-of-state-with-the-most-twitter-followers/
Dataset updated
Apr 28, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jun 1, 2020
Area covered
Worldwide
Description
In 2020, 189 countries were represented through an official presence on Twitter, either by personal or institutional accounts run by heads of state and government and foreign ministers. During the measured period, U.S. President Donald Trump was ranked first, having accumulated over 81.1 million Twitter followers on his personal account. The official @POTUS account was ranked fifth with 30.2 million followers worldwide.

Heads of state on Twitter

Twitter is a very conversational social platform, allowing users to communicate in a very public manner. Foreign ministries utilize Twitter to expand their online presence and digital diplomatic networks, and government officials are encouraged to interact with the public. The most conversational world leader on Twitter is the Government of Nepal, with 96 percent of their tweets being @ replies to other Twitter users. Another more subtle layer of Twitter diplomacy is the mutual following of peers between official heads of state, minister and other government accounts – as of June 2020, the Foreign Ministry of Iceland (@MFAIceland) was ranked first, having 147 mutual connections with other world leaders and foreign ministries on Twitter. During the measured period, @realDonaldTrump, @POTUS and the @WhiteHouse Twitter accounts did not follow any other foreign leaders. In 2018, the account of the U.S. State Department had only 59 mutual peer connections on Twitter, painting a relatively isolated picture in terms of international political communications.

Trump on Twitter

Donald Trump’s prolific Twitter usage is a hotly debated topic. The President uses Twitter on a daily basis to make comments about other politicians, celebrities and daily news, sometimes antagonizing others with his controversial statements. According to an August 2018 survey, 61 percent of U.S. adults stated that Trump's use of Twitter as President of the United States was inappropriate, while only 24 percent of respondents said the opposite. In total, 90 percent of respondents who identified as Democrats thought that Trump's Twitter use was inappropriate; while on the other end of the political spectrum only 35 percent of respondents identifying as Republicans reported having the same opinion.
Twitter Users Broken Down By Gender
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Users Broken Down By Gender [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The platform is male-dominated with 68.1% of all Twitter users being male. Just 31.9% of Twitter users are female.
101 Twitter users
figshare.com
application/x-rar
Updated May 31, 2023
Cite
Hsien-Tsung Chang; Clief Hendro Sengkey; Minh-Khoi Le (2023). 101 Twitter users [Dataset]. http://doi.org/10.6084/m9.figshare.12643865.v2
Available download formats: application/x-rar
Unique identifier
https://doi.org/10.6084/m9.figshare.12643865.v2
Dataset updated
May 31, 2023
Dataset provided by
figshare
Authors
Hsien-Tsung Chang; Clief Hendro Sengkey; Minh-Khoi Le
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We collected the data of a Twitter user using Tweepy to access the Twitter API. We crawled the list of each user account’s followers. Twitter allowed a request of a maximum of 200 tweets per time window and because of limitations of the Twitter API, we could only make a request every 15 minutes. Next, we obtained the most recent tweets of each user in the study. We extracted the most common hashtags used in the sample tweets and crawled the most recent 50 tweets that contained each hashtag and tweets that mentioned a particular user, for example ’@username.’ Initially, we chose 101 user accounts and documented the attributes of each user’s account (number of followers, a list of followers, and the recent tweets of each follower).
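The one-request-per-15-minutes constraint mentioned above amounts to pacing a loop. The following Python sketch shows that pacing only; it does not call Tweepy or the Twitter API. The paced_fetch helper, the stand-in fetch function, and the user names are hypothetical, and the sleep function is injectable so the demo records the waits instead of pausing.

```python
import time


def paced_fetch(items, fetch, interval_seconds=15 * 60, sleep=time.sleep):
    """Call fetch(item) for each item, waiting interval_seconds between consecutive calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(interval_seconds)  # wait before every call except the first
        results.append(fetch(item))
    return results


# Demonstration with a stand-in fetch and a fake sleep that only records the waits.
waits = []
out = paced_fetch(
    ["user_a", "user_b", "user_c"],
    fetch=lambda name: f"followers of {name}",
    sleep=waits.append,
)
print(out)
print(sum(waits) / 60, "minutes of waiting")
```

With real calls, the default sleep=time.sleep would be left in place and fetch would wrap whatever client function retrieves one account's data.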
Data from: Google Analytics & Twitter dataset from a movies, TV series and...
portalcientificovalencia.univeuropea.com
figshare.com
Updated 2024
Cite
Yeste, Víctor (2024). Google Analytics & Twitter dataset from a movies, TV series and videogames website [Dataset]. https://portalcientificovalencia.univeuropea.com/documentos/67321ed3aea56d4af0485dc8
Dataset updated
2024
Authors
Yeste, Víctor
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.

The object of this study is the design of a cybermetric methodology that measures the success of content published in online media and, where possible, predicts the selected success variables.

Because the study integrates data from two separate areas, web publishing and the analysis of shares and related topics on Twitter, the data were collected programmatically through both the Google Analytics v4 reporting API and the Twitter Standard API, always respecting their limits.

The website analyzed is hellofriki.com. It is an online media outlet whose primary intention is to meet the need for information on topics that generate a vast amount of daily news, as well as analysis, reports, interviews, and many other information formats. All of this content falls under the sections of cinema, series, video games, literature, and comics.

This dataset has contributed to the elaboration of the PhD Thesis:
Yeste Moreno, VM. (2021). Diseño de una metodología cibermétrica de cálculo del éxito para la optimización de contenidos web [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/176009

Data have been obtained from each last-minute news article published online according to the indicators described in the doctoral thesis. All related data are stored in a database, divided into the following tables:

tesis_followers: User ID list of media account followers.

tesis_hometimeline: data from tweets posted by the media account sharing breaking news from the web.
  status_id: Tweet ID
  created_at: date of publication
  text: content of the tweet
  path: URL extracted after processing the shortened URL in text
  post_shared: Article ID in WordPress that is being shared
  retweet_count: number of retweets
  favorite_count: number of favorites

tesis_hometimeline_other: data from tweets posted by the media account that do not share breaking news from the web. Other typologies, automatic Facebook shares, custom tweets without link to an article, etc. With the same fields as tesis_hometimeline.

tesis_posts: data of articles published by the web and processed for some analysis.
  stats_id: Analysis ID
  post_id: Article ID in WordPress
  post_date: article publication date in WordPress
  post_title: title of the article
  path: URL of the article on the media website
  tags: Tags ID or WordPress tags related to the article
  uniquepageviews: unique page views
  entrancerate: input ratio
  avgtimeonpage: average visit time
  exitrate: output ratio
  pageviewspersession: page views per session
  adsense_adunitsviewed: number of ads viewed by users
  adsense_viewableimpressionpercent: ad display ratio
  adsense_ctr: ad click ratio
  adsense_ecpm: estimated ad revenue per 1000 page views

tesis_stats: data from a particular analysis, performed at each published breaking news item. Fields with statistical values can be computed from the data in the other tables, but total and average calculations are saved for faster and easier further processing.
  id: ID of the analysis
  phase: phase of the thesis in which analysis has been carried out (right now all are 1)
  time: "0" if at the time of publication, "1" if 14 days later
  start_date: date and time of measurement on the day of publication
  end_date: date and time when the measurement is made 14 days later
  main_post_id: ID of the published article to be analysed
  main_post_theme: Main section of the published article to analyze
  superheroes_theme: "1" if about superheroes, "0" if not
  trailer_theme: "1" if trailer, "0" if not
  name: empty field, possibility to add a custom name manually
  notes: empty field, possibility to add personalized notes manually, for example if some tag has been removed manually for being considered too generic, even though the editor added it
  num_articles: number of articles analysed
  num_articles_with_traffic: number of articles analysed with traffic (which will be taken into account for traffic analysis)
  num_articles_with_tw_data: number of articles with data from when they were shared on the media’s Twitter account
  num_terms: number of terms analyzed
  uniquepageviews_total: total page views
  uniquepageviews_mean: average page views
  entrancerate_mean: average input ratio
  avgtimeonpage_mean: average duration of visits
  exitrate_mean: average output ratio
  pageviewspersession_mean: average page views per session
  total: total of ads viewed
  adsense_adunitsviewed_mean: average of ads viewed
  adsense_viewableimpressionpercent_mean: average ad display ratio
  adsense_ctr_mean: average ad click ratio
  adsense_ecpm_mean: estimated ad revenue per 1000 page views
  Total: total income
  retweet_count_mean: average income
  favorite_count_total: total of favorites
  favorite_count_mean: average of favorites
  terms_ini_num_tweets: total tweets on the terms on the day of publication
  terms_ini_retweet_count_total: total retweets on the terms on the day of publication
  terms_ini_retweet_count_mean: average retweets on the terms on the day of publication
  terms_ini_favorite_count_total: total of favorites on the terms on the day of publication
  terms_ini_favorite_count_mean: average of favorites on the terms on the day of publication
  terms_ini_followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the terms on the day of publication
  terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms on the day of publication
  terms_ini_user_num_tweets_mean: average number of tweets published by users who spoke about the terms on the day of publication
  terms_ini_user_age_mean: average age in days of users who have spoken of the terms on the day of publication
  terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms on the day of publication
  terms_end_num_tweets: total tweets on terms 14 days after publication
  terms_ini_retweet_count_total: total retweets on terms 14 days after publication
  terms_ini_retweet_count_mean: average retweets on terms 14 days after publication
  terms_ini_favorite_count_total: total bookmarks on terms 14 days after publication
  terms_ini_favorite_count_mean: average of favorites on terms 14 days after publication
  terms_ini_followers_talking_rate: ratio of media Twitter account followers who have recently posted a tweet talking about the terms 14 days after publication
  terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms 14 days after publication
  terms_ini_user_num_tweets_mean: average number of tweets published by users who have spoken about the terms 14 days after publication
  terms_ini_user_age_mean: the average age in days of users who have spoken of the terms 14 days after publication
  terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms 14 days after publication.

tesis_terms: data of the terms (tags) related to the processed articles.
  stats_id: Analysis ID
  time: "0" if at the time of publication, "1" if 14 days later
  term_id: Term ID (tag) in WordPress
  name: Name of the term
  slug: URL of the term
  num_tweets: number of tweets
  retweet_count_total: total retweets
  retweet_count_mean: average retweets
  favorite_count_total: total of favorites
  favorite_count_mean: average of favorites
  followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the term
  user_num_followers_mean: average followers of users who were talking about the term
  user_num_tweets_mean: average number of tweets published by users who were talking about the term
  user_age_mean: average age in days of users who were talking about the term
  url_inclusion_rate: URL inclusion ratio
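The description notes that the total and average fields in tesis_stats can be recomputed from the other tables. As a hedged sketch of what that could look like, the Python below builds a tiny in-memory table named after tesis_hometimeline and derives retweet and favorite totals and means with SQL aggregates. The choice of SQLite, the column types, and the three sample rows are assumptions made for illustration; the description does not say which database engine the dataset uses.

```python
import sqlite3

# Assumption: an in-memory SQLite table with a subset of the tesis_hometimeline fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tesis_hometimeline ("
    "status_id INTEGER PRIMARY KEY, post_shared INTEGER, "
    "retweet_count INTEGER, favorite_count INTEGER)"
)
# Made-up rows: (status_id, post_shared, retweet_count, favorite_count)
conn.executemany(
    "INSERT INTO tesis_hometimeline VALUES (?, ?, ?, ?)",
    [(1, 10, 4, 9), (2, 11, 0, 3), (3, 12, 2, 6)],
)

# Recompute the kind of totals and means that tesis_stats stores precomputed.
row = conn.execute(
    "SELECT COUNT(*), SUM(retweet_count), AVG(retweet_count), "
    "SUM(favorite_count), AVG(favorite_count) FROM tesis_hometimeline"
).fetchone()
num_tweets, retweet_total, retweet_mean, favorite_total, favorite_mean = row
print(num_tweets, retweet_total, retweet_mean, favorite_total, favorite_mean)
conn.close()
```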
Famous Words Twitter Dataset
kaggle.com
Updated May 30, 2023
Cite
_w1998 (2023). Famous Words Twitter Dataset [Dataset]. https://www.kaggle.com/datasets/jackksoncsie/twitter-dataset-keywords-likes-and-tweets/discussion
Dataset updated
May 30, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
_w1998
License
http://www.gnu.org/licenses/agpl-3.0.html
Description
The Famous Words Twitter Dataset is a comprehensive collection of tweets associated with famous words. The dataset provides valuable insights into the social media engagement and popularity of these words on the Twitter platform. It includes three primary columns: keyword, likes, and tweets.

The keyword column represents the specific famous word or phrase associated with each tweet. It allows researchers and analysts to explore the dynamics of user interactions and discussions surrounding these popular terms on Twitter.

The likes column indicates the number of likes received by each tweet. This metric serves as an indicator of the tweet's popularity and resonance among Twitter users.

The tweet column contains the actual tweet text, capturing the content and context of user-generated messages related to the famous words. This column provides valuable qualitative data for sentiment analysis, topic modeling, and other natural language processing tasks.

Researchers, data scientists, and social media analysts can leverage this dataset to study various aspects, such as tracking trends, sentiment analysis, understanding user engagement patterns, and identifying influential topics associated with famous words on Twitter.
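A first pass over the three columns could be a per-keyword summary. The sketch below uses a made-up three-row CSV and assumes the headers are exactly keyword, likes, and tweet; the description gives the third column as both "tweets" and "tweet", so the real header should be checked before reuse.

```python
import csv
import io
from collections import defaultdict

# Made-up rows. Assumption: the header names are keyword, likes, tweet.
SAMPLE_CSV = """keyword,likes,tweet
Bitcoin,12,"Bitcoin is up again"
Bitcoin,3,"Not sure about Bitcoin"
Zoom,7,"Another Zoom call"
"""


def likes_by_keyword(csv_text):
    """Return the tweet count and mean likes for each keyword."""
    totals = defaultdict(lambda: [0, 0])  # keyword -> [tweet count, like total]
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["keyword"]]
        entry[0] += 1
        entry[1] += int(row["likes"])
    return {
        keyword: {"tweets": count, "mean_likes": like_total / count}
        for keyword, (count, like_total) in totals.items()
    }


summary = likes_by_keyword(SAMPLE_CSV)
print(summary)
```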

Topics: "COVID-19", "Vaccine", "Zoom", "Bitcoin", "Dogecoin", "NFT", "Elon Musk", "Tesla", "Amazon", "iPhone 12", "Remote work", "TikTok", "Instagram", "Facebook", "YouTube", "Netflix", "GameStop", "Super Bowl", "Olympics", "Black Lives Matter", "India vs England", "Ukraine", "Queen Elizabeth", "World Cup", "Jeffrey Dahmer", "Johnny Depp", "Will Smith", "Weather", "xvideo", "porn", "nba", "Macdonald"

The dataset has 128837 tweets in total. The plot below shows the number of tweets for each keyword:

https://i.imgur.com/z4xbbyt.png

Note: The dataset is carefully curated, anonymized, and stripped of any personally identifiable information to protect user privacy.
Predicting age groups of Twitter users based on language and metadata...
plos.figshare.com
datasetcatalog.nlm.nih.gov
docx
Updated Jun 1, 2023
Cite
Antonio A. Morgan-Lopez; Annice E. Kim; Robert F. Chew; Paul Ruddle (2023). Predicting age groups of Twitter users based on language and metadata features [Dataset]. http://doi.org/10.1371/journal.pone.0183537
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0183537
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Antonio A. Morgan-Lopez; Annice E. Kim; Robert F. Chew; Paul Ruddle
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed the results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and the handle’s metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fit for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen’s d effect sizes were calculated to examine the relative importance of significant features. Models containing both tweet language features and metadata features performed best (74% precision, 74% recall, 74% F1), while the model containing only Twitter metadata features was the least accurate (58% precision, 60% recall, and 57% F1). Top predictive features included use of terms such as “school” for youth and “college” for young adults. Overall, it was more challenging to predict older adults accurately.
These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for informing public health surveillance and evaluation research.
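For readers unfamiliar with the reported metrics, the following is a minimal sketch of how per-class precision, recall, and F1 can be computed from true and predicted age-group labels. The label strings and the example predictions are hypothetical illustrations, not data from the study, and the function name is an assumption.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1) treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for six users (not taken from the dataset).
y_true = ["youth", "youth", "young_adult", "adult", "adult", "young_adult"]
y_pred = ["youth", "young_adult", "young_adult", "adult", "youth", "young_adult"]

for group in ("youth", "young_adult", "adult"):
    print(group, precision_recall_f1(y_true, y_pred, group))
```

Averaging the per-class values (for example, a simple macro average) gives a single figure per model; the abstract does not state which averaging the authors used.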
Data from: Trust and Believe – Should We? Evaluating the Trustworthiness of...
zenodo.org
bin, csv +2
Updated Aug 22, 2022
Cite
Tanveer Khan (2022). Trust and Believe – Should We? Evaluating the Trustworthiness of Twitter Users [Dataset]. http://doi.org/10.5281/zenodo.6964059
Explore at:
Available download formats: text/x-python, txt, csv, bin
Unique identifier
https://doi.org/10.5281/zenodo.6964059
Dataset updated
Aug 22, 2022
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Tanveer Khan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Trust and Believe – Should We? Evaluating the Trustworthiness of Twitter Users

This model analyzes Twitter users and assigns each a score calculated from their social profile, the credibility of their tweets, and the h-index score of their tweets. Users with a higher score are considered not only more influential, but their tweets are also considered more credible. The model is based on both user-level and content-level features of a Twitter user. The details of feature extraction and of calculating the Influence score are given in the paper.

Description
We used Python to extract the features from Twitter and generate the dataset. The modAL framework is used to randomly select ambiguous data points from the unlabeled data pool using three different sampling techniques, and a human manually annotates the selected data. We generated a dataset for 50000 Twitter users and then used different classifiers to classify each Twitter user as either Trusted or Untrusted.

Organization
The project consists of the following files:

Dataset.csv
This dataset contains features for 50000 Twitter users (politicians), without labels.

Manually_labeled-Dataset.csv
This CSV file contains the Twitter users that were manually classified as Trusted or Untrusted.

feature_extraction.py
This Python script calculates the Influence score of a Twitter user and is also used to generate the dataset. The Influence score is based on:

- Social reputation of the user
- Content score of the tweets
- Tweets credibility
- Index score for the number of re-tweets and likes
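The description does not define the index score. As a rough illustration only, the sketch below assumes it follows the standard h-index definition applied to per-tweet retweet (or like) counts: the largest h such that at least h tweets each have at least h retweets. The function name and sample counts are hypothetical, and the formula in the authors' paper may differ.

```python
def h_index(counts):
    """Largest h such that at least h items have a count of at least h."""
    h = 0
    # Walk the counts from largest to smallest; rank is the 1-based position.
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical retweet counts for one user's six most recent tweets.
retweets_per_tweet = [10, 8, 5, 4, 3, 0]
print(h_index(retweets_per_tweet))
```

The same function can be applied separately to like counts, giving one index per engagement type.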

Activelearner.ipynb
To classify a large pool of unlabeled data, we used an active learning model (modAL framework). Active learning is a semi-supervised approach suited to situations in which unlabeled data is abundant but manual labeling is expensive. The active learner randomly selects ambiguous data points from the unlabeled data pool using three different sampling techniques, and a human manually annotates the selected data. We then use four different classifiers (Support Vector Machine, Logistic Regression, Multilayer Perceptron and Random Forest) to classify each Twitter user as either Trusted or Untrusted.
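The three sampling techniques are not named here. The sketch below assumes they are the common uncertainty-based measures (least confidence, margin, and entropy), which modAL also provides as query strategies. It is written in plain Python rather than against the modAL API, and the function names and the probability pool are hypothetical.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability; larger means more ambiguous."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes; smaller means more ambiguous."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy of the predicted distribution; larger means more ambiguous."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_ambiguous(pool_probs, score, larger_is_more_ambiguous=True):
    """Index of the pool item a human annotator would be asked to label next."""
    pick = max if larger_is_more_ambiguous else min
    return pick(range(len(pool_probs)), key=lambda i: score(pool_probs[i]))


# Hypothetical [P(Trusted), P(Untrusted)] predictions for three unlabeled users.
pool = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]

print(most_ambiguous(pool, least_confidence))
print(most_ambiguous(pool, margin, larger_is_more_ambiguous=False))
print(most_ambiguous(pool, entropy))
```

With only two classes, as in the Trusted/Untrusted setting, the three measures rank the pool identically; they can diverge when there are more classes.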

twitter_reputation.ipynb
We used regression models to test performance on our generated dataset (this notebook was for testing only and is no longer part of our work).
Training and testing three regression models:
1. Multilayer perceptron
2. Deep neural network
3. Linear regression

twitter_credentials.py
To extract the features of Twitter users, one first needs to authenticate by providing the credentials in this file.

Screen names (Screen_name_1.txt, Screen_name_2.txt, Screen_name_3.txt)
These text files contain the screen_names of all the Twitter users, all of whom are politicians. We removed the names of politicians whose accounts are private, as well as those who have no followers or followings. We also removed duplicate names. The text of the tweets is not saved.

References
[1] https://stackoverflow.com/questions/38881314/twitter-data-to-csv-getting-error-when-trying-to-add-to-csv-file

[2] https://stackoverflow.com/questions/48157259/python-tweepy-api-user-timeline-for-list-of-multiple-users-error

[3] https://gallery.azure.ai/Notebook/Computing-Influence-Score-for-Twitter-Users-1

[4] https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

[5] https://towardsdatascience.com/deep-neural-networks-for-regression-problems-81321897ca33
What Are The Most Popular Twitter Accounts?
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). What Are The Most Popular Twitter Accounts? [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With over 611 million monthly active users, building a huge Twitter following is not an easy task. These are the top 25 accounts with the most followers on Twitter right now.
Data from: IA Tweets Analysis Dataset (Spanish)
data.niaid.nih.gov
Updated Aug 3, 2024
Cite
Muñoz, Andrés (2024). IA Tweets Analysis Dataset (Spanish) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10821484
Explore at:
Dataset updated
Aug 3, 2024
Dataset provided by
Muñoz, Andrés
Guerrero-Contreras, Gabriel
Balderas-Díaz, Sara
Serrano-Fernández, Alejandro
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General Description

This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

Data Collection Method

Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

Dataset Content

ID: A unique identifier for each tweet.

text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

user_followers_count: The current number of followers the account has. It is a non-negative integer.

user_friends_count: The number of users that the account is following. It is a non-negative integer.

user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.
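As an illustration of the schema above, here is a minimal sketch that parses rows into typed values and enforces the stated constraints (280-character text, non-negative integer counts, True/False booleans). The column order, the literal "True"/"False" spelling in the file, and the sample row are assumptions made for this example and are not taken from the dataset itself.

```python
import csv
import io

BOOL_FIELDS = [
    "user_verified", "user_default_profile", "user_has_extended_profile",
    "user_protected", "user_is_translator",
]
COUNT_FIELDS = [
    "favorite_count", "retweet_count", "user_followers_count",
    "user_friends_count", "user_favourites_count", "user_statuses_count",
]


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed values, validating each field."""
    parsed = {"ID": row["ID"], "text": row["text"], "polarity": row["polarity"]}
    if len(parsed["text"]) > 280:
        raise ValueError("text exceeds 280 characters")
    for name in COUNT_FIELDS:
        value = int(row[name])
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
        parsed[name] = value
    for name in BOOL_FIELDS:
        if row[name] not in ("True", "False"):
            raise ValueError(f"{name} must be True or False")
        parsed[name] = row[name] == "True"
    return parsed


# Hypothetical one-row sample; the real file name and contents are not shown here.
SAMPLE_CSV = (
    "ID,text,polarity,favorite_count,retweet_count,user_verified,"
    "user_default_profile,user_has_extended_profile,user_followers_count,"
    "user_friends_count,user_favourites_count,user_statuses_count,"
    "user_protected,user_is_translator\n"
    "1,La IA cambia todo,Positive,3,1,False,True,False,120,80,45,900,False,False\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(rows[0]["polarity"], rows[0]["user_followers_count"])
```

To read the actual file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle and adjust the boolean parsing if the file encodes booleans differently.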

Cite as

Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

Potential Use Cases

This dataset is aimed at academic researchers and practitioners with interests in:

Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

Exploring correlations between user engagement metrics and sentiment in discussions about AI.

Data Format and File Type

The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

License

The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.
X/Twitter: distribution of global audiences 2025, by age and gender
statista.com
Updated Jun 19, 2025
Cite
Statista (2025). X/Twitter: distribution of global audiences 2025, by age and gender [Dataset]. https://www.statista.com/statistics/1498204/distribution-of-users-on-twitter-worldwide-age-and-gender/
Explore at:
Dataset updated
Jun 19, 2025
Dataset authored and provided by
Statista: http://statista.com/
Time period covered
Feb 2025
Area covered
Worldwide
Description
As of February 2025, 24.5 percent of X (formerly Twitter) users were men aged between 25 and 34 years. Overall, almost 19 percent of users were men aged between 18 and 24 years. X has a high share of male users when compared to other popular social media platforms.

