100+ datasets found

Twitter bot profiling
researchdata.smu.edu.sg
figshare.com
pdf
Updated May 31, 2023
Cite
Living Analytics Research Centre (2023). Twitter bot profiling [Dataset]. http://doi.org/10.25440/smu.12062706.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.25440/smu.12062706.v1
Dataset updated
May 31, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
Living Analytics Research Centre
License
http://rightsstatements.org/vocab/InC/1.0/
Description
This dataset comprises a set of Twitter accounts in Singapore that are used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates content and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:

Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).

Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.

Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.

This categorization is general enough to cater for new, emerging types of bots (e.g., chatbots can be viewed as a special type of broadcast bot). The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users having many followers), their follow, retweet, and user mention links were crawled. The data collection proceeded by adding those followers/followees, retweet sources, and mentioned users who stated Singapore in their profile location. Using this procedure, a total of 159,724 accounts were collected. To identify bots, the first step was to select active accounts that tweeted at least 15 times within the month of April 2014. These accounts were then manually checked and labelled, and 589 bots were found. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.

Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
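
The active-account step in the description (at least 15 tweets within April 2014) could be sketched roughly as below. This is only an illustration: the (account_id, created_at) record layout, the function name active_accounts, and the toy data are assumptions, not the dataset's actual schema or the authors' code.

```python
from collections import Counter
from datetime import datetime

def active_accounts(tweets, year=2014, month=4, min_tweets=15):
    """Return the IDs of accounts with at least `min_tweets` tweets in the given month.

    `tweets` is an iterable of (account_id, created_at) pairs; this layout is
    an assumption made for illustration, not the dataset's actual schema.
    """
    counts = Counter(
        account_id
        for account_id, created_at in tweets
        if created_at.year == year and created_at.month == month
    )
    return {account_id for account_id, n in counts.items() if n >= min_tweets}

# Toy data: "a" tweets 15 times in April 2014, "b" only 14 times,
# and "c" tweets 20 times but in March.
sample = (
    [("a", datetime(2014, 4, d)) for d in range(1, 16)]
    + [("b", datetime(2014, 4, d)) for d in range(1, 15)]
    + [("c", datetime(2014, 3, d)) for d in range(1, 21)]
)
print(active_accounts(sample))
```

Only account "a" passes the filter in this toy data; in the described procedure, accounts passing it were then labelled by hand.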
Twitter users in France 2019-2028
statista.com
Updated Mar 3, 2025
Cite
Statista (2025). Twitter users in France 2019-2028 [Dataset]. https://www.statista.com/forecasts/1144232/twitter-users-in-france
Explore at:
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
France
Description
The number of Twitter users in France was forecast to increase continuously between 2024 and 2028 by a total of 0.8 million users (+8.18 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 10.59 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Twitter users in countries like Luxembourg and Netherlands.
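
As a quick consistency check on the forecast figures, the implied 2024 baseline can be backed out from the 2028 value and the percentage increase. The sketch below uses only the two numbers quoted in the description; the function name implied_baseline is introduced here for illustration.

```python
def implied_baseline(end_value, growth_percent):
    """Back out the starting value from an end value and a percentage increase."""
    return end_value / (1 + growth_percent / 100)

peak_2028 = 10.59  # million users in 2028, from the description
growth = 8.18      # percent increase between 2024 and 2028, from the description

baseline_2024 = implied_baseline(peak_2028, growth)
increase = peak_2028 - baseline_2024
print(round(baseline_2024, 2), round(increase, 2))
```

The implied baseline is about 9.79 million users, and the difference from the 2028 value is about 0.8 million, which agrees with the stated increase.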
Twitter users in the United States 2019-2028
statista.com
Updated Jun 13, 2024
Cite
Twitter users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
Explore at:
Dataset updated
Jun 13, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
United States
Description
The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 85.08 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Twitter users in countries like Canada and Mexico.
Twitter Friends
kaggle.com
Updated Sep 2, 2016
Cite
Hubert Wassner (2016). Twitter Friends [Dataset]. https://www.kaggle.com/hwassner/TwitterFriends/discussion
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 2, 2016
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Hubert Wassner
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Twitter Friends and hashtags

Context

This dataset is an extract of a wider database aimed at collecting Twitter users' friends (the other accounts one follows). The overall goal is to study users' interests through who they follow and the connection to the hashtags they have used.

Content

It is a list of Twitter users' information. In the JSON format, each Twitter user is stored as one object in a list of more than 40,000 objects. Each object holds:

avatar : URL to the profile picture

followerCount : the number of followers of this user

friendsCount : the number of accounts this user follows

friendName : stores the @name (without the '@') of the user (beware: this name can be changed by the user)

id : user ID; this number cannot change (you can retrieve the screen name with this service: https://tweeterid.com/)

friends : the list of IDs the user follows (the data stored is the IDs of users followed by this user)

lang : the language declared by the user (in this dataset there is only "en" (English))

lastSeen : the timestamp of the date when this user posted their last tweet

tags : the hashtags (with or without #) used by the user. These are the "trending topics" the user tweeted about.

tweetID : Id of the last tweet posted by this user.

You also have the CSV format which uses the same naming convention.

These users were selected because they tweeted on Twitter trending topics. I selected users that have at least 100 followers and follow at least 100 other accounts (in order to filter out spam and non-informative/empty accounts).
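
A minimal sketch of working with the JSON format is shown below. It uses the field names listed in the description; the two records are invented, and treating the file as a single JSON array is an assumption based on the description of "one object" per user in a list.

```python
import json
from collections import Counter

# Two made-up records using the field names listed above; the values are invented.
raw = """
[
  {"id": 1, "friendName": "alice", "followerCount": 250, "friendsCount": 120,
   "friends": [2, 3], "lang": "en", "tags": ["#python", "data"]},
  {"id": 2, "friendName": "bob", "followerCount": 90, "friendsCount": 300,
   "friends": [1], "lang": "en", "tags": ["#Python"]}
]
"""

def normalize_tag(tag):
    """Hashtags may appear with or without '#', so strip it and lowercase."""
    return tag.lstrip("#").lower()

users = json.loads(raw)

# Re-apply the selection rule from the description:
# at least 100 followers and at least 100 followed accounts.
selected = [u for u in users if u["followerCount"] >= 100 and u["friendsCount"] >= 100]

# Count hashtags across all users after normalization.
tag_counts = Counter(normalize_tag(t) for u in users for t in u["tags"])
print([u["friendName"] for u in selected], tag_counts.most_common(1))
```

Normalizing the tags before counting matters here because the description notes that hashtags are stored with or without the leading '#'.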

Acknowledgements

This dataset was built by Hubert Wassner (me) using the Twitter public API. More data can be obtained on request (hubert.wassner AT gmail.com); at this time I have collected over 5 million in different languages. Some more information can be found here (in French only): http://wassner.blogspot.fr/2016/06/recuperer-des-profils-twitter-par.html

Past Research

No public research has been done (until now) on this dataset. I made a private application, described here (in French): http://wassner.blogspot.fr/2016/09/twitter-profiling.html. It uses the full dataset (millions of full profiles).

Inspiration

One can analyse a lot of things with this dataset:

stats about followers & followings

manifold learning or unsupervised learning from friend list

hashtag prediction from friend list

Contact

Feel free to ask any question (or help request) via Twitter : @hwassner

Enjoy! ;)
Twitter Users Broken down By Country
searchlogistics.com
Cite
Twitter Users Broken down By Country [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The US has historically been the target country for Twitter since its launch in 2006. This is the full breakdown of Twitter users by country.
Twitter users in Mexico 2019-2028
statista.com
Updated Mar 3, 2025
Cite
Statista (2025). Twitter users in Mexico 2019-2028 [Dataset]. https://www.statista.com/forecasts/1144192/twitter-users-in-mexico
Explore at:
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Mexico
Description
The number of Twitter users in Mexico was forecast to increase continuously between 2024 and 2028 by a total of 3.1 million users (+23.7 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 16.21 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Twitter users in countries like Canada and United States.
Twitter Revenue Growth
searchlogistics.com
Cite
Twitter Revenue Growth [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Advertising makes up 89% of Twitter's total revenue, and data licensing makes up about 11%.
Data from: Twitter Users
searchlogistics.com
Updated Mar 17, 2025
Cite
(2025). Twitter Users [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
Explore at:
Dataset updated
Mar 17, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The average Twitter user spends 5.1 hours per month on the platform.
Twitter Statistics
searchlogistics.com
Cite
Search Logistics, Twitter Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset authored and provided by
Search Logistics
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These Twitter user statistics will give you the complete story of where Twitter is at today and what the future looks like for the social media company.
Data from: IA Tweets Analysis Dataset (Spanish)
data.niaid.nih.gov
produccioncientifica.uca.es
+1 more
Updated Aug 3, 2024
Cite
IA Tweets Analysis Dataset (Spanish) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10821484
Explore at:
Dataset updated
Aug 3, 2024
Dataset provided by
Muñoz, Andrés
Serrano-Fernández, Alejandro
Balderas-Díaz, Sara
Guerrero-Contreras, Gabriel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General Description

This dataset comprises 4,038 tweets in Spanish, related to discussions about artificial intelligence (AI), and was created and utilized in the publication "Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights," (10.1109/IE61493.2024.10599899) presented at the 20th International Conference on Intelligent Environments. It is designed to support research on public perception, sentiment, and engagement with AI topics on social media from a Spanish-speaking perspective. Each entry includes detailed annotations covering sentiment analysis, user engagement metrics, and user profile characteristics, among others.

Data Collection Method

Tweets were gathered through the Twitter API v1.1 by targeting keywords and hashtags associated with artificial intelligence, focusing specifically on content in Spanish. The dataset captures a wide array of discussions, offering a holistic view of the Spanish-speaking public's sentiment towards AI.

Dataset Content

ID: A unique identifier for each tweet.

text: The textual content of the tweet. It is a string with a maximum allowed length of 280 characters.

polarity: The tweet's sentiment polarity (e.g., Positive, Negative, Neutral).

favorite_count: Indicates how many times the tweet has been liked by Twitter users. It is a non-negative integer.

retweet_count: The number of times this tweet has been retweeted. It is a non-negative integer.

user_verified: When true, indicates that the user has a verified account, which helps the public recognize the authenticity of accounts of public interest. It is a boolean data type with two allowed values: True or False.

user_default_profile: When true, indicates that the user has not altered the theme or background of their user profile. It is a boolean data type with two allowed values: True or False.

user_has_extended_profile: When true, indicates that the user has an extended profile. An extended profile on Twitter allows users to provide more detailed information about themselves, such as an extended biography, a header image, details about their location, website, and other additional data. It is a boolean data type with two allowed values: True or False.

user_followers_count: The current number of followers the account has. It is a non-negative integer.

user_friends_count: The number of users that the account is following. It is a non-negative integer.

user_favourites_count: The number of tweets this user has liked since the account was created. It is a non-negative integer.

user_statuses_count: The number of tweets (including retweets) posted by the user. It is a non-negative integer.

user_protected: When true, indicates that this user has chosen to protect their tweets, meaning their tweets are not publicly visible without their permission. It is a boolean data type with two allowed values: True or False.

user_is_translator: When true, indicates that the user posting the tweet is a verified translator on Twitter. This means they have been recognized and validated by the platform as translators of content in different languages. It is a boolean data type with two allowed values: True or False.
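
To illustrate how the columns above might be read and typed, here is a small sketch using Python's standard csv module. The rows are invented, only a subset of the listed columns is shown, and the assumption that booleans are stored as the strings "True"/"False" follows the description rather than an inspection of the actual file.

```python
import csv
import io
from collections import defaultdict

# Invented rows using a subset of the column names listed above.
sample_csv = """ID,text,polarity,favorite_count,retweet_count,user_verified
1,Me encanta la IA,Positive,10,2,True
2,La IA me preocupa,Negative,4,1,False
3,La IA avanza,Positive,6,0,False
"""

def parse_bool(value):
    """Convert the 'True'/'False' strings used for boolean columns."""
    return value.strip().lower() == "true"

likes_by_polarity = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample_csv)):
    likes = int(row["favorite_count"])
    if likes < 0:
        raise ValueError("favorite_count should be a non-negative integer")
    row["user_verified"] = parse_bool(row["user_verified"])
    likes_by_polarity[row["polarity"]].append(likes)

# Mean number of likes per sentiment polarity.
mean_likes = {p: sum(v) / len(v) for p, v in likes_by_polarity.items()}
print(mean_likes)
```

A grouping like this is one simple way to start exploring the relation between engagement metrics and sentiment.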

Cite as

Guerrero-Contreras, G., Balderas-Díaz, S., Serrano-Fernández, A., & Muñoz, A. (2024, June). Enhancing Sentiment Analysis on Social Media: Integrating Text and Metadata for Refined Insights. In 2024 International Conference on Intelligent Environments (IE) (pp. 62-69). IEEE.

Potential Use Cases

This dataset is aimed at academic researchers and practitioners with interests in:

Sentiment analysis and natural language processing (NLP) with a focus on AI discussions in the Spanish language.

Social media analysis on public engagement and perception of artificial intelligence among Spanish speakers.

Exploring correlations between user engagement metrics and sentiment in discussions about AI.

Data Format and File Type

The dataset is provided in CSV format, ensuring compatibility with a wide range of data analysis tools and programming environments.

License

The dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, permitting sharing, copying, distribution, transmission, and adaptation of the work for any purpose, including commercial, provided proper attribution is given.
Twitter Key Statistics
searchlogistics.com
Cite
Twitter Key Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are the key Twitter user statistics that you need to know.
Data from: GeoCoV19: A Dataset of Hundreds of Millions of Multilingual...
zenodo.org
data.niaid.nih.gov
zip
Updated Jun 16, 2020
+ more versions
Cite
Umair Qazi; Muhammad Imran; Ferda Ofli (2020). GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with Location Information [Dataset]. http://doi.org/10.5281/zenodo.3878599
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3878599
Dataset updated
Jun 16, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Umair Qazi; Muhammad Imran; Ferda Ofli
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present GeoCoV19, a large-scale Twitter dataset related to the ongoing COVID-19 pandemic. The dataset was collected over a period of 90 days, from February 1 to May 1, 2020, and consists of more than 524 million multilingual tweets. As geolocation information is essential for many tasks such as disease tracking and surveillance, we employed a gazetteer-based approach to extract toponyms from user location and tweet content, and derived their geolocation information using the Nominatim (OpenStreetMap) data at different geolocation granularity levels. In terms of geographical coverage, the dataset spans over 218 countries and 47K cities in the world. The tweets in the dataset are from more than 43 million Twitter users, including around 209K verified accounts. These users posted tweets in 62 different languages.
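
The general idea of gazetteer-based toponym extraction can be sketched as a lookup of word n-grams against a list of known place names. The example below is a toy illustration only: the three-entry gazetteer, the function name find_toponyms, and the ASCII-only tokenizer are assumptions, and it does not reproduce the authors' pipeline, which relies on Nominatim data and handles multilingual text.

```python
import re

# A toy gazetteer mapping lowercase place names to (city, country).
# The entries are illustrative; the dataset itself relies on Nominatim data.
GAZETTEER = {
    "doha": ("Doha", "Qatar"),
    "new york": ("New York", "United States"),
    "paris": ("Paris", "France"),
}

def find_toponyms(text, gazetteer=GAZETTEER, max_words=2):
    """Return gazetteer entries whose names occur as word n-grams in `text`."""
    # Simplification: only ASCII letters are tokenized.
    words = re.findall(r"[a-z]+", text.lower())
    matches = []
    i = 0
    while i < len(words):
        # Prefer the longest n-gram starting at position i.
        for n in range(max_words, 0, -1):
            chunk = words[i:i + n]
            if len(chunk) == n and " ".join(chunk) in gazetteer:
                matches.append(gazetteer[" ".join(chunk)])
                i += n
                break
        else:
            i += 1
    return matches

print(find_toponyms("Born in Paris, living in New York!"))
```

Trying the longest n-gram first keeps multi-word names such as "New York" from being split into separate, possibly wrong, single-word matches.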
Twitter Dataset
brightdata.com
.json, .csv, .xlsx
Updated Jan 8, 2023
+ more versions
Cite
Bright Data (2024). Twitter Dataset [Dataset]. https://brightdata.com/products/datasets/twitter
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Jan 8, 2023
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our Twitter dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset provides a comprehensive understanding of social media trends, empowering organizations to refine their communication and marketing strategies. Access the entire dataset or customize a subset to fit your needs. Popular use cases include market research to identify trending topics and hashtags, AI training by reviewing factors such as tweet content, retweets, and user interactions for predictive analytics, and trend forecasting by examining correlations between specific themes and user engagement to uncover emerging social media preferences.
X/Twitter: number of worldwide users 2019-2024
statista.com
flwrdeptvarieties.store
Updated Nov 15, 2023
Cite
Statista (2023). X/Twitter: number of worldwide users 2019-2024 [Dataset]. https://www.statista.com/statistics/303681/twitter-users-worldwide/
Explore at:
Dataset updated
Nov 15, 2023
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Dec 2022
Area covered
Worldwide
Description
As of December 2022, X/Twitter's audience accounted for over 368 million monthly active users worldwide. This figure was projected to decrease to approximately 335 million by 2024, a decline of around five percent compared to 2022.
Twitter users in Brazil 2019-2028
statista.com
Updated Mar 3, 2025
Cite
Statista (2025). Twitter users in Brazil 2019-2028 [Dataset]. https://www.statista.com/forecasts/1146589/twitter-users-in-brazil
Explore at:
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Brazil
Description
The number of Twitter users in Brazil was forecast to increase continuously between 2024 and 2028 by a total of 3.4 million users (+15.79 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 24.96 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
SenTopX: A Benchmark Twitter Dataset for User Sentiment on Various Topics
zenodo.org
csv, zip
Updated May 27, 2024
Cite
Hina Qayyum (2024). SenTopX: A Benchmark Twitter Dataset for User Sentiment on Various Topics [Dataset]. http://doi.org/10.5281/zenodo.11243662
Explore at:
Available download formats: zip, csv
Unique identifier
https://doi.org/10.5281/zenodo.11243662
Dataset updated
May 27, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Hina Qayyum
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
May 25, 2024
Description
This is a longitudinal Twitter dataset of 143K users during the period 2017-2021. The following is the detail of all the files:

SenTopX_userIDs.txt: contains user IDs of 143K Twitter users.

userIDs_tweetIDs.zip: contains Tweet IDs of users, the name of the file is the user ID and the file contains the list of all the tweet IDs.

users_16_perspective_toxicity_scores.csv: contains user IDs and 16 Perspective API scores per user; the vector is shared as the mean, median, and Gini index of the scores calculated over all tweets of a user.

LDAvis_top30_words_for_extracted_topics.csv: contains the top 30 most relevant words for each topic extracted by tweet-level topic modeling using the BERTweet topic model.

topic_modelling_statistics_per_user.csv: contains important and relevant statistics related to topic modeling results:

1. user: This column represents the identifier for the user. Each row in the CSV corresponds to a specific user, and this column helps to track and differentiate between the users.

2. avg_topic_probability: This column contains the average probability of the topics for each user calculated across all of the tweets in order to compare users in a meaningful way. It represents the average likelihood that a particular user discusses various topics over the observed period.

3. maximum_topic_avg: This column holds the value of the highest average probability among all topics for each user. It indicates the topic that the user most frequently discusses, on average.

4. index_max_avg_topic_probability_200: This column specifies the index or identifier of the topic with the highest average probability out of 200 possible topics. It shows which topic (out of 200) the user discusses the most.

5. global_avg: This column includes the global average probability of topics across all users. It provides a baseline or overall average topic probability that can be used for comparative purposes.

6. max_global_avg: This column contains the maximum global average probability across all topics for all users. It identifies the most discussed topic across the entire user base.

7. index_max_global_avg: This column shows the index or identifier of the topic with the highest global average probability. It indicates which topic (out of 200) is the most popular across all users.

8. entropy_200_topic: This column represents the entropy of the topics for each user, calculated over 200 topics. Entropy measures the diversity or unpredictability in the user's discussion of topics, with higher entropy indicating more varied topic discussion.

In summary, these columns are used to analyze the topic engagement and preferences of users on a platform, highlighting the most frequently discussed topics, the variability in topic discussions, and how individual user behavior compares to overall trends.
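
One plausible reading of the per-user columns (average topic probabilities, the maximum average and its index, and the entropy) is sketched below. This is an interpretation, not the authors' code: the function name user_topic_stats is hypothetical, the toy example uses 3 topics instead of 200, and the natural logarithm is an assumed choice of log base.

```python
import math

def user_topic_stats(tweet_topic_probs):
    """Summarize one user's tweets, each given as a list of topic probabilities."""
    n_tweets = len(tweet_topic_probs)
    n_topics = len(tweet_topic_probs[0])
    # Average each topic's probability over all of the user's tweets.
    avg = [sum(t[k] for t in tweet_topic_probs) / n_tweets for k in range(n_topics)]
    max_avg = max(avg)
    # Shannon entropy with the natural log (the log base is an assumption).
    entropy = -sum(p * math.log(p) for p in avg if p > 0)
    return {
        "avg_topic_probability": avg,
        "maximum_topic_avg": max_avg,
        "index_max_avg_topic_probability": avg.index(max_avg),
        "entropy": entropy,
    }

# Toy example with 3 topics instead of 200.
focused = user_topic_stats([[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]])
varied = user_topic_stats([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]])
print(focused["index_max_avg_topic_probability"], focused["entropy"] < varied["entropy"])
```

The user who mostly discusses one topic gets a lower entropy than the user whose tweets are spread across topics, matching the description of entropy as a measure of topic diversity.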
Tweets Dataset
kaggle.com
Updated Feb 6, 2020
+ more versions
Cite
Marcos Martins Marchetti (2020). Tweets Dataset [Dataset]. https://www.kaggle.com/datasets/mmmarchetti/tweets-dataset/suggestions
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 6, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Marcos Martins Marchetti
Description
This dataset was gathered by crawling Twitter's REST API using the Python library tweepy 3. It contains the tweets of the 20 most popular Twitter users (those with the most followers); retweets are excluded. These accounts belong to public figures such as Katy Perry and Barack Obama, platforms such as YouTube and Instagram, and television channels and shows, e.g., CNN Breaking News and The Ellen Show.

Consequently, the dataset contains a mix of relatively structured tweets, tweets written in a formal and informative manner, and completely unstructured tweets written in a colloquial style. Unfortunately, the geocoordinates were not available for those tweets.

This dataset has been used to produce a research paper titled "Machine Learning Techniques for Anomalies Detection in Post Arrays".

Crawled attributes are: Author (Twitter User), Content (Tweet), Date_Time, id (Twitter User ID), language (Tweet Language), Number_of_Likes, Number_of_Shares.

Overall: 52543 tweets of top 20 users in Twitter

Screen_Name      #Tweets   Time span (in days)
TheEllenShow     3,147     662
jimmyfallon      3,123     1,231
ArianaGrande     3,104     613
YouTube          3,077     411
KimKardashian    2,939     603
katyperry        2,924     1,598
selenagomez      2,913     2,266
rihanna          2,877     1,557
BarackObama      2,863     849
britneyspears    2,776     1,548
instagram        2,577     456
shakira          2,530     1,850
Cristiano        2,507     2,407
jtimberlake      2,478     2,491
ladygaga         2,329     894
Twitter          2,290     2,593
ddlovato         2,217     741
taylorswift13    2,029     2,091
justinbieber     2,000     664
cnnbrk           1,842     183

(2017)
Data from: Database of Indian Social Media Influencers on Twitter
search.dataone.org
dataverse.harvard.edu
Updated Nov 11, 2023
+ more versions
Cite
Arya, Arshia; De, Soham; Mishra, Dibyendu; Shekhawat, Gazal; Sharma, Ankur; Panda, Anmol; M Lalani, Faisal; Singh, Parantak; Kommiya Mothilal, Ramaravind; Grover, Rynaa; Nishal, Sachita; Dash, Saloni; Rashid Shora, Shehla; Akbar, Syeda Zainab; Pal, Joyojeet (2023). Database of Indian Social Media Influencers on Twitter [Dataset]. http://doi.org/10.7910/DVN/T2CFHO
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/T2CFHO
Dataset updated
Nov 11, 2023
Dataset provided by
Harvard Dataverse
Authors
Arya, Arshia; De, Soham; Mishra, Dibyendu; Shekhawat, Gazal; Sharma, Ankur; Panda, Anmol; M Lalani, Faisal; Singh, Parantak; Kommiya Mothilal, Ramaravind; Grover, Rynaa; Nishal, Sachita; Dash, Saloni; Rashid Shora, Shehla; Akbar, Syeda Zainab; Pal, Joyojeet
Description
Databases of highly networked individuals have been indispensable in studying narratives and influence on social media. To support studies on Twitter in India, we present a systematically categorized database of accounts of influence on Twitter in India, identified and annotated through an iterative process of friends, networks, and self-described profile information, verified manually. We built an initial set of accounts based on the friend network of a seed set of accounts chosen for real-world renown in various fields, then snowballed "friends of friends" multiple times, and rank-ordered individuals based on the number of in-group connections and overall followers. We then manually classified the identified accounts under the categories of entertainment, sports, business, government, institutions, journalism, and civil society accounts that have independent standing outside of social media, as well as a category of "digital first", referring to accounts that derive their primary influence from online activity. Overall, we annotated 11580 unique accounts across all categories. The database is useful for studying various questions related to the role of influencers in polarisation, misinformation, extreme speech, political discourse, etc.
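
The snowballing and ranking steps described above could look roughly like the following sketch. It is a simplified illustration under stated assumptions: the friends_of mapping (who each account follows), the follower counts, the function names, and the fixed number of rounds are all hypothetical, and the authors' manual verification and classification steps are not modelled.

```python
def snowball(friends_of, seeds, rounds=2):
    """Expand a seed set by repeatedly adding the accounts its members follow."""
    group = set(seeds)
    for _ in range(rounds):
        new = set()
        for account in group:
            new.update(friends_of.get(account, ()))
        group |= new
    return group

def rank_by_in_group_connections(friends_of, group, followers):
    """Order accounts by in-group followers, then by overall follower count."""
    in_group = {account: 0 for account in group}
    for account in group:
        for friend in friends_of.get(account, ()):
            if friend in in_group:
                in_group[friend] += 1
    return sorted(group, key=lambda a: (in_group[a], followers.get(a, 0)), reverse=True)

# Hypothetical friend lists (who each account follows) and follower counts.
friends_of = {"seed": ["a", "b"], "a": ["b", "c"], "b": ["c"], "c": []}
followers = {"seed": 10, "a": 500, "b": 800, "c": 2000}

group = snowball(friends_of, ["seed"], rounds=2)
print(rank_by_in_group_connections(friends_of, group, followers))
```

In this toy graph, accounts "c" and "b" are each followed by two group members, so the overall follower count breaks the tie between them.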
COVID-19 Tweets : A dataset contaning more than 600k tweets on the novel...
data.niaid.nih.gov
zenodo.org
Updated Jan 23, 2021
Cite
Yassine Drias (2021). COVID-19 Tweets : A dataset contaning more than 600k tweets on the novel CoronaVirus [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4024176
Explore at:
Dataset updated
Jan 23, 2021
Dataset provided by
Yassine Drias
Habiba Drias
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains 653 996 tweets related to the Coronavirus topic, identified by hashtags such as #COVID-19, #COVID19, #COVID, #Coronavirus, #NCoV and #Corona. The crawling period started on the 27th of February and ended on the 25th of March 2020, covering four weeks.

The tweets were generated by 390 458 users from 133 different countries and were written in 61 languages. English is the most used language with almost 400k tweets, followed by Spanish with around 80k tweets.

The data is stored as a CSV file, where each line represents a tweet. The CSV file provides information on the following fields:

Author: the user who posted the tweet

Recipient: contains the name of the user in case of a reply, otherwise it would have the same value as the previous field

Tweet: the full content of the tweet

Hashtags: the list of hashtags present in the tweet

Language: the language of the tweet

Relationship: gives information on the type of the tweet, whether it is a retweet, a reply, a tweet with a mention, etc.

Location: the country of the author of the tweet, which is unfortunately not always available

Date: the publication date of the tweet

Source: the device or platform used to send the tweet

The dataset can also be used to construct a social graph, since it includes the relations "Replies to", "Retweet", "MentionsInRetweet" and "Mentions".
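As a rough illustration of how the fields above could feed a social graph, the sketch below reads CSV rows and counts directed edges from Author to Recipient for the four listed relations. The exact header spelling, the "Tweet" value for plain tweets, the date format, and the three sample rows are assumptions made for the example, not taken from the actual file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real file's header names and values may differ.
SAMPLE = """Author,Recipient,Tweet,Hashtags,Language,Relationship,Location,Date,Source
alice,alice,Stay safe,#COVID19,en,Tweet,France,2020-03-01,Twitter Web App
bob,alice,Thanks for this,#COVID19,en,Replies to,Spain,2020-03-01,Twitter for iPhone
carol,alice,RT Stay safe,#COVID19,en,Retweet,,2020-03-02,Twitter for Android
"""

# Relation names as listed in the dataset description.
GRAPH_RELATIONS = {"Replies to", "Retweet", "MentionsInRetweet", "Mentions"}


def build_edges(rows):
    """Count directed (author, recipient, relation) edges.

    Rows where Recipient equals Author are skipped, since the description
    says Recipient repeats the author when the tweet is not a reply.
    """
    edges = Counter()
    for row in rows:
        author, recipient = row["Author"], row["Recipient"]
        relation = row["Relationship"]
        if relation in GRAPH_RELATIONS and author != recipient:
            edges[(author, recipient, relation)] += 1
    return edges


edges = build_edges(csv.DictReader(io.StringIO(SAMPLE)))
for (author, recipient, relation), count in sorted(edges.items()):
    print(f"{author} -> {recipient} [{relation}] x{count}")
```

To run this against the downloaded file, replace the `io.StringIO(SAMPLE)` argument with a file opened via `open(path, newline="", encoding="utf-8")`, after checking the real column names.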
Why Do People Use Twitter?
searchlogistics.com
Why Do People Use Twitter? [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
One of the biggest advantages of Twitter is the speed at which information can be passed around. People use Twitter primarily to get news and for entertainment. This is the breakdown of why people use Twitter today.

