As of February 2025, micro-blogging platform X (formerly Twitter) was more popular with men than women, with male audiences accounting for 63.7 percent of global users. Additionally, users between the ages of 25 and 34 were particularly active on X/Twitter, making up more than 37 percent of users worldwide.
How many people use X/Twitter?
Although X/Twitter holds its status as a mainstream social media site, it falls short of other well-known platforms in terms of user numbers. As of early 2022, X/Twitter had around 436 million monthly active users (MAU), whilst Meta’s Facebook reached almost three billion MAU. Overall, the United States is home to over 105 million X/Twitter users, making it the platform's largest audience base, followed by Japan, India, and the United Kingdom, respectively.
How is Twitter used?
X/Twitter is used by its audience for many different purposes. In May 2021, over 80 percent of high-volume X/Twitter users (defined as users who tweet around 20 times per month) in the United States reported using the platform for entertainment, whilst 78 percent said they used it as a way to stay informed. High-volume X/Twitter users were far more likely to use the service as a means of expressing their opinion. Furthermore, in 2022, over half of social media users in the U.S. used Twitter as a news resource.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The platform is male-dominated with 68.1% of all Twitter users being male. Just 31.9% of Twitter users are female.
As of February 2025, 37.5 percent of X’s (formerly Twitter) global audience was aged between 25 and 34 years. The second-largest age group on the platform was users aged between 18 and 24 years, with a share of 32.1 percent. Users aged less than 18 years accounted for two percent of users, while those aged 50 or older accounted for roughly 7.3 percent.
X is a male-dominated platform
As of January 2024, more than 60 percent of X users were male. Although most mainstream social media platforms tend to have a slightly more male-skewing audience, X stands out above Instagram, Snapchat, TikTok, and Facebook when it comes to user gender demographics. Overall, Pinterest is the only mainstream platform to have a higher share of female users.
X Blue for you
It is now common for social media users to have the chance to become subscribers of their chosen online networks for a monthly fee. X Blue is a subscription service from X that gives users special benefits and features. A blue verification mark, edit post functionality, fewer ads, priority ranking in chats, and longer video upload times are some of the perks offered.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
This is the breakdown of Twitter users by age group.
In 2022, Mexican-Spanish singer Belinda Peregrín, known simply as Belinda, had the most popular female account on Twitter in Spain, with more than 6.8 million followers. Repeating her success on Instagram and Twitter, singer Rosalía ranked second on the micro-blogging platform in the country, with more than 4.3 million followers. While football has dominated Twitter's sphere in Spain, the most popular female handles on the platform belonged to women working in communication, culture, and content creation, a more varied mix than the male-dominated sphere of the most followed accounts.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
These are the key Twitter user statistics that you need to know.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
These Twitter user statistics will give you the complete story of where Twitter is at today and what the future looks like for the social media company.
As of February 2025, X/Twitter was the social network with the highest share of male users, who accounted for 63.7 percent of global users. Overall, social media platforms had more male users than female users.
Synopsis
The datasets contain three files: a follower-feeds.ndjson as input, a labels.ndjson as output, and a celebrity-feeds.ndjson for additional study. Each file lists all celebrities as JSON objects, one per line and identified by the id key. The training dataset contains 1,920 celebrities and is balanced towards gender and occupation. The supplement dataset contains the remaining 8,265 celebrities but is not balanced in any way.
The follower-feeds.ndjson contains the English tweets of at least 10 followers for each celebrity, with at least 50 tweets each excluding retweets.
{"id": 1234, "text": [["a tweet of follower 1", "another tweet of follower 1", ...], ["a tweet of follower 2", ...], ...]}
{"id": 5678, "text": [["a tweet of follower 1", "another tweet of follower 1", ...], ["a tweet of follower 2", ...], ...]}
The celebrity-feeds.ndjson contains the Twitter timelines of the original celebrities, formatted as:
{"id": 1234, "text": ["a tweet of celebrity 1", "another tweet of celebrity 1", ...]}
{"id": 5678, "text": ["a tweet of celebrity 2", "another tweet", ...]}
The labels.ndjson contains the classes that should be predicted. A valid submission has to produce a labels.ndjson given the follower-feeds.ndjson and contain an entry for each id given in the input.
{"id": 1234, "occupation": "sports", "gender": "female", "birthyear": 2002}
{"id": 5678, "occupation": "professional", "gender": "male", "birthyear": 1990}
The following values are possible for each of the traits:
occupation := {sports, performer, creator, politics}
birthyear := {1940, ..., 1999}
gender := {male, female}
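A produced labels.ndjson can be sanity-checked before it is submitted. The following is a minimal, unofficial sketch, assuming only the file layout and the value sets listed above; the helper names `read_ndjson` and `check_labels` are hypothetical and not part of any task tooling.

```python
import json

# Value sets as listed above (assumed here to be exhaustive).
OCCUPATIONS = {"sports", "performer", "creator", "politics"}
GENDERS = {"male", "female"}
BIRTHYEARS = range(1940, 2000)  # 1940 through 1999 inclusive


def read_ndjson(lines):
    """Parse an iterable of NDJSON lines into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def check_labels(input_rows, label_rows):
    """Return a list of problems; an empty list means every input id has a valid entry."""
    problems = []
    labelled = {row["id"]: row for row in label_rows}
    for row in input_rows:
        cid = row["id"]
        label = labelled.get(cid)
        if label is None:
            problems.append(f"missing entry for id {cid}")
            continue
        if label.get("occupation") not in OCCUPATIONS:
            problems.append(f"id {cid}: invalid occupation {label.get('occupation')!r}")
        if label.get("gender") not in GENDERS:
            problems.append(f"id {cid}: invalid gender {label.get('gender')!r}")
        if label.get("birthyear") not in BIRTHYEARS:
            problems.append(f"id {cid}: invalid birthyear {label.get('birthyear')!r}")
    return problems
```

Both helpers accept any iterable of lines, so an open file handle can be passed directly to `read_ndjson`. Note that the two example rows shown above use birthyear 2002 and occupation "professional", which fall outside the listed value sets and would be flagged by this check.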
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
The US has historically been the target country for Twitter since its launch in 2006. This is the full breakdown of Twitter users by country.
Paper: https://webis.de/publications.html?q=wiegmann_2019a
Source Dataset: https://files.webis.de/data-in-progress/data-research/social-media-analysis/acl19-celebrity-profiling/
Celebrities are among the most prolific users of social media, promoting their personas and rallying followers. This activity is closely tied to genuine writing samples, rendering them worthy research subjects in many respects, not least author profiling.
The Celebrity Profiling task this year is to predict four traits of a celebrity from their social media communication. The traits are the degree of fame, occupation, age, and gender. The social media communication is given as the teaser messages from past tweets. The goal is to develop a piece of software which predicts celebrity traits from the teaser history.
The training dataset contains two files: a feeds.ndjson as input and a labels.ndjson as output. Each file lists all celebrities as JSON objects, one per line and identified by the id key.
The input file contains the id and a list of all teaser messages for each celebrity.
{"id": 1234, "text": ["a tweet", "another tweet", ...]}
The output file contains the id and a value for each trait for each celebrity from the input file.
{"id": 1234, "fame": "star", "occupation": "sports", "gender": "female", "birthyear": 2002}
The following values are possible for each of the traits:
fame := {rising, star, superstar}
occupation := {sports, performer, creator, politics, manager, science, professional, religious}
birthyear := {1940, ..., 2012}
gender := {male, female, nonbinary}
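Because feeds.ndjson and labels.ndjson share the id key, the two files can be joined into training examples. The sketch below shows one possible way to do this, assuming one JSON object per line as described above; the function name `pair_feeds_with_labels` is hypothetical.

```python
import json


def pair_feeds_with_labels(feed_lines, label_lines):
    """Join feeds and labels on the shared "id" key.

    Returns a list of (texts, label_dict) tuples in feed order,
    skipping celebrities that have no label entry.
    """
    labels = {}
    for line in label_lines:
        if line.strip():
            row = json.loads(line)
            labels[row["id"]] = row

    pairs = []
    for line in feed_lines:
        if not line.strip():
            continue
        row = json.loads(line)
        label = labels.get(row["id"])
        if label is not None:
            pairs.append((row["text"], label))
    return pairs
```

Each resulting pair holds the list of teaser messages and the dictionary of traits for one celebrity, which is a convenient shape for feeding a classifier.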
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Twitter has over 611 million monthly active users, so building a huge following on the platform is not an easy task. These are the top 25 accounts with the most followers on Twitter right now.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Advertising makes up 89% of Twitter's total revenue and data licensing makes up about 11%.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
If you use the dataset, cite the paper: https://doi.org/10.1016/j.eswa.2022.117541
This is the most comprehensive dataset to date on climate change and human opinions expressed via Twitter. It has the longest temporal coverage, spanning over 13 years, includes over 15 million tweets spatially distributed across the world, and provides the geolocation of most tweets. Seven dimensions of information are tied to each tweet, namely geolocation, user gender, climate change stance and sentiment, aggressiveness, deviations from historic temperature, and topic modeling, accompanied by information on environmental disaster events. These dimensions were produced by testing and evaluating a range of state-of-the-art machine learning algorithms and methods, both supervised and unsupervised, including BERT, RNN, LSTM, CNN, SVM, Naive Bayes, VADER, Textblob, Flair, and LDA.
The following columns are in the dataset:
➡ created_at: The timestamp of the tweet.
➡ id: The unique id of the tweet.
➡ lng: The longitude of the location where the tweet was written.
➡ lat: The latitude of the location where the tweet was written.
➡ topic: The categorization of the tweet into one of ten topics, namely seriousness of gas emissions, importance of human intervention, global stance, significance of pollution awareness events, weather extremes, impact of resource overconsumption, Donald Trump versus science, ideological positions on global warming, politics, and undefined.
➡ sentiment: A score on a continuous scale from -1 to 1, where values closer to 1 indicate positive sentiment, values closer to -1 indicate negative sentiment, and values close to 0 indicate neutral or no sentiment.
➡ stance: Whether the tweet supports the belief in man-made climate change (believer), rejects it (denier), or neither supports nor rejects it (neutral).
➡ gender: Whether the user that made the tweet is male, female, or undefined.
➡ temperature_avg: The temperature deviation in Celsius, relative to the January 1951-December 1980 average, at the time and place the tweet was written.
➡ aggressiveness: Whether the tweet contains aggressive language or not.
Because Twitter forbids publishing the text of tweets, the text has to be retrieved through a process called hydrating. Tools such as Twarc or Hydrator can be used to hydrate tweets.
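Hydration tools generally take a plain text file with one tweet id per line as input. The sketch below prepares such a file from the id column listed above. It assumes the dataset is distributed as a CSV file with a header row, which is not stated here; the function names are hypothetical.

```python
import csv


def extract_tweet_ids(csv_file):
    """Return the values of the `id` column as strings, in file order.

    Ids are kept as strings so that large 64-bit tweet ids are never
    rounded by a float conversion.
    """
    reader = csv.DictReader(csv_file)
    return [row["id"] for row in reader if row.get("id")]


def write_id_file(ids, out_file):
    """Write one tweet id per line to an open text file."""
    for tweet_id in ids:
        out_file.write(f"{tweet_id}\n")
```

Both functions work on open file objects, so the output of `write_id_file` can be saved to disk and passed to the hydration tool of choice.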
As of July 2023, 13 percent of male X (formerly Twitter) users in the United Kingdom reported using the service multiple times per day, compared to six percent of female X/Twitter users in the UK. Overall, 11 percent of male users accessed the platform daily. Elon Musk, the world’s richest person, bought the micro-blogging service in October 2022 and rebranded the platform in the months following the acquisition.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
This project examines how to enhance users' exposure to and engagement with verified and ideologically balanced news in an ecologically valid setting. We rely on a large-scale, two-week field experiment on 28,457 Twitter users. We created 28 bots utilizing GPT-2 that replied to users tweeting about sports, entertainment, or lifestyle with a contextual reply containing two hardcoded elements: a URL to the topic-relevant section of a quality news organization and an encouragement to follow its Twitter account. Treated users were randomly assigned to receive responses from bots presented as female or male. We examine whether our intervention enhances the following of news media organizations, the sharing/liking of news content, and the tweeting/liking of political content. We find that the treated users followed more news accounts and that the users in the female bot treatment were more likely to like news content than the control group.
As of February 2025, approximately 34.2 percent of X (formerly Twitter) users in the United Kingdom (UK) were women. By comparison, male users on the social network accounted for 65.8 percent of total users.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Twitter is ranked as the 12th most popular social media site in the world. The platform currently has 611 million monthly active users.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
One of the biggest advantages of Twitter is the speed at which information can be passed around. People use Twitter primarily to get news and for entertainment. This is the breakdown of why people use Twitter today.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
The dataset is a subset of the TBCOV dataset collected at QCRI filtered for mentions of personally related COVID-19 deaths. The filtering was done using regular expressions such as my * passed, my * died, my * succumbed & lost * battle. A sample of the dataset was annotated on Appen. Please see 'annotation-instructions.txt' for the full instructions provided to the annotators.
The "classifier_filtered_english.csv" file contains 33k deduplicated and classifier-filtered tweets (following X's content redistribution policy) for the 6 countries (Australia, Canada, India, Italy, United Kingdom, and United States) from March 2020 to March 2021, with classifier-predicted death labels, regular expression-filtered gender and relationship labels, and the user device label. The full 57k regex-filtered collection of tweets can be made available in special cases to academics and researchers.
date: the date of the tweet
country_name: the country name from Nominatim API
tweet_id: the ID of the tweet
url: the full URL of the tweet
full_text: the full-text content of the tweet (also includes the URL of any media attached)
does_the_tweet_refer_to_the_covidrelated_death_of_one_or_more_individuals_personally_known_to_the_tweets_author: the classifier predicted label for the death (also includes the original labels for the annotated samples)
what_is_the_relationship_between_the_tweets_author_and_the_victim_mentioned: the annotated relationship labels
relative_to_the_time_of_the_tweet_when_did_the_mentioned_death_occur: the annotated relative time labels
user_is_verified: if the user is verified or not
user_gender: the gender of the Twitter user (from the user profile)
user_device: the Twitter client the user uses
has_media: if the tweet has any attached media
has_url: if the tweet text contains a URL
matched_device: the device (Apple or Android) based on the Twitter client
regex_gender: the gender inferred from regular expression-based filtering
regex_relationship: the relationship label from regular expression-based filtering
We first determine the mapping from the relationship labels mentioned in the tweet to a gender. We do not use relationships such as "cousin" from which the gender cannot easily be inferred.
Male relationships: 'father', 'dad', 'daddy', 'papa', 'pop', 'pa', 'son', 'brother', 'uncle', 'nephew', 'grandfather', 'grandpa', 'gramps', 'husband', 'boyfriend', 'fiancé', 'groom', 'partner', 'beau', 'friend', 'buddy', 'pal', 'mate', 'companion', 'boy', 'gentleman', 'man', 'father-in-law', 'brother-in-law', 'stepfather', 'stepbrother'
Female relationships: 'mother', 'mom', 'mama', 'mum', 'ma', 'daughter', 'sister', 'aunt', 'niece', 'grandmother', 'grandma', 'granny', 'wife', 'girlfriend', 'fiancée', 'bride', 'partner', 'girl', 'lady', 'woman', 'miss', 'mother-in-law', 'sister-in-law', 'stepmother', 'stepsister'
Based on these mappings, we used the following regex for each gender label to determine the gender of the deceased mentioned in the tweet.
r"[mM]y\s(" + "|".join(rel + "s?" for rel in relationships) + r")\s(died|succumbed|deceased)"
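To make the matching step concrete, the sketch below builds a pattern of this shape and applies it to short example strings. It uses raw strings and only an illustrative subset of the male relationship list; the function names `build_pattern` and `find_relationship` are hypothetical and not taken from the authors' code.

```python
import re

# Illustrative subset of the male relationship list above.
MALE_RELATIONSHIPS = ["father", "dad", "brother", "uncle", "grandfather", "husband"]


def build_pattern(relationships):
    """Compile a regex for 'my <relationship>[s] died|succumbed|deceased'."""
    alternatives = "|".join(re.escape(rel) + "s?" for rel in relationships)
    return re.compile(r"[mM]y\s(" + alternatives + r")\s(died|succumbed|deceased)")


def find_relationship(text, pattern):
    """Return the matched relationship term, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


male_pattern = build_pattern(MALE_RELATIONSHIPS)
print(find_relationship("My father died of covid last week", male_pattern))
print(find_relationship("My cousin died yesterday", male_pattern))
```

The first call finds a term from the list, so the tweet would receive the male label; the second returns None because "cousin" is not in the list. The optional "s" means plural forms such as "brothers" are captured as written.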
First, we get the relationship labels using regex filtering, and then we group them into different age-group categories as shown in the following table. The UK and the US use different age groups because of the different age group definitions in the official data.
Category | Relationship (from tweets) | Age Group (UK) | Age Group (US) |
Grandparents | grandfather, grandmother | 65+ | 65+ |
Parents | father, mother, uncle, aunt | 45-64 | 35-64 |
Siblings | brother, sister, cousin | 15-44 | 15-34 |
Children | son, daughter, nephew, niece | 0-14 | 0-14 |
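The table can be expressed as a simple lookup. The sketch below transcribes the rows above into a dictionary and resolves a relationship label to its age group for a given country; the names `CATEGORIES` and `age_group` and the country codes "UK" and "US" are choices made for this illustration.

```python
# Rows of the table above: category -> (relationships, UK age group, US age group)
CATEGORIES = {
    "Grandparents": (["grandfather", "grandmother"], "65+", "65+"),
    "Parents": (["father", "mother", "uncle", "aunt"], "45-64", "35-64"),
    "Siblings": (["brother", "sister", "cousin"], "15-44", "15-34"),
    "Children": (["son", "daughter", "nephew", "niece"], "0-14", "0-14"),
}


def age_group(relationship, country):
    """Return the age group for a relationship label, or None if it is not in the table.

    `country` must be "UK" or "US"; anything else raises ValueError.
    """
    if country not in ("UK", "US"):
        raise ValueError(f"unsupported country: {country!r}")
    for relationships, uk_group, us_group in CATEGORIES.values():
        if relationship in relationships:
            return uk_group if country == "UK" else us_group
    return None
```

Relationship labels that do not appear in the table, such as "friend", return None and would need to be handled separately.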
The 'english-training.csv' file contains about 13k deduplicated human-annotated tweets. We use a random seed (42) to create the train/test split. The model Covid-Bert-V2 was fine-tuned on the training set for 2 epochs with the following hyperparameters (obtained using 10-fold CV): random_seed: 42, batch_size: 32, dropout: 0.1. We obtained an F1-score of 0.81 on the test set. We used about 5% (671) of the combined and deduplicated annotated tweets as the test set, about 2% (255) as the validation set, and the remaining 12,494 tweets for fine-tuning the model. The tweets were preprocessed to replace mentions, URLs, emojis, etc. with generic keywords. The model was trained on a system with a single Nvidia A4000 16GB GPU. The fine-tuned model is also available as the 'model.bin' file. The code for fine-tuning the model and for reproducing the experiments is available in this GitHub repository.
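The reported split sizes add up to 13,420 annotated tweets (12,494 + 255 + 671). The sketch below shows one way a seeded split with these sizes could be produced. The exact procedure used by the authors is not described beyond the seed, so the shuffling approach and the function name `split_indices` are assumptions.

```python
import random


def split_indices(n_total, n_test, n_val, seed=42):
    """Shuffle row indices with a fixed seed and cut them into train/val/test lists."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


train, val, test = split_indices(13420, n_test=671, n_val=255)
print(len(train), len(val), len(test))
```

Using a dedicated `random.Random` instance keeps the split reproducible without touching the global random state.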
We also include a datasheet for the dataset, following the recommendation of "Datasheets for Datasets" (Gebru et al.), which provides more information about how the dataset was created and how it can be used. Please see "Datasheet.pdf".
NOTE: We recommend that researchers try to rehydrate the individual tweets to ensure that the user has not deleted the tweet since posting. This gives users a mechanism to opt out of having their data analyzed.
Please only use your institutional email when requesting the dataset as anything else (like gmail.com) will be rejected. The dataset will only be made available on reasonable request for Academics and Researchers. Please mention why you need the dataset and how you plan to use the dataset when making a request.