94 datasets found

Top 50 trending topics (trends) of Twitter for 2018 (one hour interval)
data.mendeley.com
Updated Feb 16, 2019
Cite
Issa Annamoradnejad (2019). Top 50 trending topics (trends) of Twitter for 2018 (one hour interval) [Dataset]. http://doi.org/10.17632/d4ccnh588k.1
Explore at:
Unique identifier
https://doi.org/10.17632/d4ccnh588k.1
Dataset updated
Feb 16, 2019
Authors
Issa Annamoradnejad
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the top 50 trending topics (trends) on Twitter, obtained from the Twitter Trends API at an hourly rate. For each hour, there is a row in the dataset that contains the date, time, trending topic, and the related tweet count (if available). The data cover more than 97% of 2018, the portion of the year during which our script was available.
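As a rough illustration of how such hourly rows might be consumed, the sketch below counts the distinct hours in which each topic appeared and sums the tweet counts where they are present. The column names (`date`, `time`, `topic`, `tweet_count`) and the inline sample rows are assumptions made for illustration; the real file layout should be checked against the download.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the assumed layout: one row per (hour, topic).
SAMPLE = """date,time,topic,tweet_count
2018-01-01,00:00,#HappyNewYear,120000
2018-01-01,00:00,#NYE,
2018-01-01,01:00,#HappyNewYear,150000
"""

def hours_trending(csv_text):
    """Return {topic: number of distinct hours in which the topic appeared}."""
    seen = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        seen[row["topic"]].add((row["date"], row["time"]))
    return {topic: len(hours) for topic, hours in seen.items()}

def total_tweets(csv_text):
    """Sum tweet counts per topic, skipping rows where the count is missing."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["tweet_count"]:  # the count is only present "if available"
            totals[row["topic"]] += int(row["tweet_count"])
    return dict(totals)
```

Treating an empty count as missing rather than as zero keeps topics without a reported count out of the totals.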
Twitter Trending Hashtags
searchlogistics.com
Updated May 25, 2023
Cite
(2023). Twitter Trending Hashtags [Dataset]. https://www.searchlogistics.com/learn/statistics/hashtags-statistics/
Explore at:
Dataset updated
May 25, 2023
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Hubspot’s research found that the following hashtags are the most popular right now.
Twitter Dataset
brightdata.com
.json, .csv, .xlsx
+ more versions
Cite
Bright Data (2025). Twitter Dataset [Dataset]. https://brightdata.com/products/datasets/twitter
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset authored and provided by
Bright Data https://brightdata.com/
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our Twitter dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset provides a comprehensive understanding of social media trends, empowering organizations to refine their communication and marketing strategies. Access the entire dataset or customize a subset to fit your needs. Popular use cases include market research to identify trending topics and hashtags, AI training by reviewing factors such as tweet content, retweets, and user interactions for predictive analytics, and trend forecasting by examining correlations between specific themes and user engagement to uncover emerging social media preferences.
Data from: Use of hashtags in Hispanic digital native media
figshare.com
portalcientificovalencia.univeuropea.com
txt
Updated Feb 7, 2024
Cite
Víctor Yeste (2024). Use of hashtags in Hispanic digital native media [Dataset]. http://doi.org/10.6084/m9.figshare.25160954.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.25160954.v1
Dataset updated
Feb 7, 2024
Dataset provided by
figshare
Authors
Víctor Yeste
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.

In this paper, we have studied, from the point of view of trend analysis, the results of retweets that used hashtags, posted on Twitter by a set of Spanish and Portuguese digital native media (those born from and for the online environment).

The field of study, therefore, is that of the Twitter accounts of generalist digital native media in Spain and Portugal. A sample of 96 accounts was extracted, 48 from Spanish media and the other 48 from the Portuguese press. Quota sampling was used, with the following sources: Regulatory Entity for Social Communication (https://www.erc.pt/pt/listagem-registos-na-erc) and New Media Observatory (https://nuevosmedios.es/).

This study consists of two dimensions applied to the use of each hashtag:

The type of sender of the tweets: the selected set of digital native media (prefix "m_" in the variables) or the set of Twitter users in general, using for the latter the Twitter API search functionality (prefix "tw_" in the variables).

The language used by the sender of each tweet: Spanish (suffix "_es" in the variables) or Portuguese (suffix "_pt" in the variables).

Taking these dimensions into account, the data were collected using the following analysis variables, all of them in count mode, that is, adding up the totals captured for each tweet in which the analyzed hashtag was included:

Tweets (variable name "num_tweets")
Retweets (variable name "retweet_count")
Favorites (variable name "favorite_count")
Followers of tweeters (variable name "user_num_followers")
Tweets published by tweeters (variable name "user_num_tweets")
Age in days of tweeters' Twitter accounts (variable name "user_age")
Tweets that include a URL (variable name "url_inclusion")

These analysis variables were used in a correlational study, in which regression analysis is performed to determine the relationships between the variables and to assess the possibility of predicting the number of retweets. To do this, two snapshots were taken: an initial snapshot (suffix "_t0" in the variables) and a snapshot 14 days later (suffix "_t1" in the variables).

This dataset has contributed to the elaboration of the book chapter:

Yeste-Moreno, V., Calduch-Losa, Á., & Serrano-Cobos, J. (2021). Análisis predictivo colectivo del uso de hashtags en medios nativos digitales hispanos. In Digital media: el papel de las redes sociales en el ecosistema educomunicativo en tiempos de Covid-19 (pp. 197-224). McGraw-Hill Interamericana de España.
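The prefix and suffix conventions described here lend themselves to a small parser. The sketch below assumes that a full variable name is laid out as sender prefix, metric, language suffix, snapshot suffix (for example `m_retweet_count_es_t0`); that ordering is an assumption, since the description lists the affixes but not how they are combined.

```python
import re
from typing import NamedTuple

class Variable(NamedTuple):
    sender: str    # "m" (digital native media) or "tw" (Twitter users in general)
    metric: str    # e.g. "retweet_count"
    language: str  # "es" or "pt"
    snapshot: str  # "t0" (initial) or "t1" (14 days later)

# Assumed layout: <sender>_<metric>_<language>_<snapshot>
_PATTERN = re.compile(r"^(m|tw)_(.+)_(es|pt)_(t0|t1)$")

def parse_variable(name: str) -> Variable:
    """Split a variable name into its four components, or raise ValueError."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised variable name: {name!r}")
    return Variable(*match.groups())
```

If the real column names order the affixes differently, only `_PATTERN` needs to change.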
COVID-19 Twitter Engagement Data
opendatabay.com
Updated Jul 8, 2025
Cite
Datasimple (2025). COVID-19 Twitter Engagement Data [Dataset]. https://www.opendatabay.com/data/web-social/222b5de3-34ba-460d-918b-d917fc82b075
Explore at:
Dataset updated
Jul 8, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Data Science and Analytics
Description
This dataset focuses on Twitter engagement metrics related to the Coronavirus disease (COVID-19), an infectious disease caused by the SARS-CoV-2 virus [1]. It provides a detailed collection of tweets, including their text content, the accounts that posted them, any hashtags used, and the geographical locations associated with the accounts [1]. The dataset is valuable for understanding public discourse, information dissemination, and engagement patterns on Twitter concerning COVID-19, particularly for analysing how people experience mild to moderate symptoms and recover, or require medical attention [1].

Columns

Datetime: Represents the exact date and time a tweet was posted [2].

Tweet Id: A unique identifier assigned to each tweet [2].

Text: The actual content of the tweet [2].

Username: The display name of the tweet author [2].

Permalink: The direct link to the tweet on Twitter [2].

User: A link to the author's Twitter account [2].

Outlinks: Any external links included within the tweet [2].

CountLinks: The number of links present in the tweet [2].

ReplyCount: The total number of replies to that specific tweet [2].

RetweetCount: The total number of retweets of that specific tweet [2].

DateTime Count: A daily count of tweets, aggregated by date ranges [2].

Label Count: A count associated with specific ranges of tweet IDs or other engagement metrics, indicating the distribution of tweets within those ranges [3-5].

Distribution

The dataset is structured with daily tweet counts and covers a period from 10 January 2020 to 28 February 2020 [2, 6, 7]. It includes approximately 179,040 daily tweet entries during this timeframe, derived from the sum of daily counts and tweet ID counts [2, 3, 6-11]. Tweet activity shows distinct peaks, with notable increases in late January (e.g., 6,091 tweets between 23-24 January 2020) [2] and a significant surge in late February, reaching 47,643 tweets between 26-27 February 2020, followed by 42,289 and 44,824 in subsequent days [7, 10, 11]. The distribution of certain tweet engagement metrics, such as replies or retweets, indicates that a substantial majority of tweets (over 152,500 records) fall within lower engagement ranges (e.g., 0-43 or 0-1628.96), with fewer tweets showing very high engagement (e.g., only 1 record between 79819.04-81448.00) [4, 5]. The data file would typically be in CSV format [12].
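The quoted engagement ranges are consistent with a histogram of 50 equal-width bins spanning 0 to 81,448. The bin count is inferred here from the numbers and is not stated in the source. A minimal sketch of that arithmetic:

```python
def equal_width_edges(lo, hi, bins):
    """Return the bins + 1 edges of an equal-width histogram over [lo, hi]."""
    width = (hi - lo) / bins
    return [round(lo + i * width, 2) for i in range(bins + 1)]

# Assumption: 50 bins over the range 0 to 81448.
edges = equal_width_edges(0, 81448, 50)
first_bin = (edges[0], edges[1])
last_bin = (edges[-2], edges[-1])
```

Under this assumption the first bin runs from 0 to 1628.96 and the last from 79819.04 to 81448, matching the ranges quoted in the description.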

Usage

This dataset is ideal for:

* Data Science and Analytics projects focused on social media [1].
* Visualization of tweet trends and engagement over time.
* Exploratory data analysis to uncover patterns in COVID-19 related discussions [1].
* Natural Language Processing (NLP) tasks, such as sentiment analysis or topic modelling on tweet content [1].
* Data cleaning and preparation exercises for social media data [1].

Coverage

The dataset has a global geographic scope [13]. It covers tweet data from 10 January 2020 to 28 February 2020 [2, 6, 7]. The content is specific to the Coronavirus disease (COVID-19) [1].

License

CC0

Who Can Use It

This dataset is particularly useful for:

* Data scientists and analysts interested in social media trends and public health discourse [1].
* Researchers studying information spread and public sentiment during health crises.
* Developers building AI and LLM data solutions [13].
* Individuals interested in exploratory analysis and data visualization of real-world social media data [1].

Dataset Name Suggestions

COVID-19 Twitter Engagement Data

SARS-CoV-2 Tweet Activity Log

Pandemic Social Media Discourse

Coronavirus Tweets Analytics

Global COVID-19 Tweet Metrics

Attributes

Original Data Source: Covid_19 Tweets Dataset
Data from: Twitter hashtag analysis of movie premieres in February 2022 in...
figshare.com
portalcientificovalencia.univeuropea.com
xlsx
Updated Feb 7, 2024
Cite
Víctor Yeste (2024). Twitter hashtag analysis of movie premieres in February 2022 in the USA [Dataset]. http://doi.org/10.6084/m9.figshare.25163177.v2
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.25163177.v2
Dataset updated
Feb 7, 2024
Dataset provided by
figshare
Authors
Víctor Yeste
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.

This work is an exploratory, quantitative, non-experimental study with an inductive inference type and a longitudinal follow-up. It analyzes movie data and tweets published by users using the official Twitter hashtags of movie premieres the week before, the same week, and the week after each release date.

The scope of the study is the collection of movies released in February 2022 in the USA, and the object of the study includes them and the tweets that refer to each film in the 3 weeks closest to its premiere date. The collected tweets were classified by the week in which they were published, along a time dimension called timepoint. The week before the release date is designated timepoint 1, the week of the release date is timepoint 2, and the week immediately afterward is timepoint 3. Another dimension considered is whether the movie is a domestic production: if one of the countries of origin is the United States, the movie is designated as domestic.

The chosen variables are organized in two data tables, one for the movies and one for the collected tweets.

Variables related to the movies:

id: Internal id of the movie
name: Title of the movie
hashtag: Official hashtag of the movie
countries: List of countries of the movie, separated by a semicolon
mpaa: Film rating under the system of the Motion Picture Association of America. It is a completely voluntary rating system and ratings have no legal standing. The current ratings include G (general audiences), PG (parental guidance suggested), PG-13 (parents strongly cautioned), R (restricted, under 17 requires accompanying parent or adult guardian) and NC-17 (no one 17 and under admitted) (Film Ratings - Motion Picture Association, n.d.)
genres: List of genres of the movie, e.g., Action or Thriller, separated by a semicolon
release_date: Release date of the movie in the format YYYY-MM-DD
opening_grosses: Amount in US dollars that the movie obtained on the opening date (the first week after the release date)
opening_theaters: Number of US theaters that released the movie on the opening date (the first week after the release date)
rating_avg: Average rating of the movie

Variables related to the tweets:

id: Internal id of the tweet
status_id: Twitter id of the tweet
movie_id: Internal id of the movie
timepoint: Week number, relative to the movie premiere, in which the tweet was published. “1” is the week before the movie release, “2” is the week after the movie release, and “3” is the second week after the movie release.
author_id: Twitter id of the author of the tweet
created_at: Date and time of the tweet, with format “YYYY-MM-DD HH:MM:SS”
quote_count: Number of the tweet’s quotes
reply_count: Number of the tweet’s replies
retweet_count: Number of the tweet’s retweets
like_count: Number of the tweet’s likes
sentiment: Sentiment analysis of the tweet’s content with a range from -1 (negative) to 1 (positive)

This dataset has contributed to the elaboration of the book chapters:

Yeste, Víctor; Calduch-Losa, Ángeles (2022). Genre classification of movie releases in the USA: Exploring data with Twitter hashtags. In Narrativas emergentes para la comunicación digital (pp. 1012-1044). Dykinson, S. L.

Yeste, Víctor; Calduch-Losa, Ángeles (2022). Exploratory Twitter hashtag analysis of movie premieres in the USA. In Desafíos audiovisuales de la tecnología y los contenidos en la cultura digital (pp. 169-187). McGraw-Hill Interamericana de España S.L.

Yeste, Víctor; Calduch-Losa, Ángeles (2022). ANOVA to study movie premieres in the USA and online conversation on Twitter. The case of rating average using data from official Twitter hashtags. In El mapa y la brújula. Navegando por las metodologías de investigación en comunicación (pp. 151-168). Editorial Fragua.
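The timepoint classification can be sketched as a function of the tweet's `created_at` value and the movie's `release_date`, using the two formats given in the variable list. The exact week boundaries are an assumption: here timepoint 1 covers the 7 days before the release date, timepoint 2 the release day and the following 6 days, and timepoint 3 the 7 days after that. The dataset itself may draw the boundaries differently.

```python
from datetime import date, datetime
from typing import Optional

def timepoint(created_at: str, release_date: str) -> Optional[int]:
    """Map a tweet timestamp to timepoint 1, 2 or 3, or None outside the window."""
    tweeted = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S").date()
    released = date.fromisoformat(release_date)
    offset = (tweeted - released).days  # whole days relative to the release date
    if -7 <= offset < 0:
        return 1  # assumed: the 7 days before release
    if 0 <= offset < 7:
        return 2  # assumed: release day plus the next 6 days
    if 7 <= offset < 14:
        return 3  # assumed: the 7 days after that
    return None
```

Returning `None` for out-of-window tweets makes it easy to filter them out before any per-timepoint aggregation.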
X/Twitter: number of worldwide users 2019-2024
statista.com
Updated Dec 13, 2022
Cite
Statista (2022). X/Twitter: number of worldwide users 2019-2024 [Dataset]. https://www.statista.com/statistics/303681/twitter-users-worldwide/
Explore at:
Dataset updated
Dec 13, 2022
Dataset authored and provided by
Statista http://statista.com/
Time period covered
Dec 2022
Area covered
Worldwide
Description
As of December 2022, X/Twitter's audience accounted for over *** million monthly active users worldwide. This figure was projected to ******** to approximately *** million by 2024, a ******* of around **** percent compared to 2022.
Twitter Key Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Key Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are the key Twitter user statistics that you need to know.
Twitter Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
Search Logistics (2025). Twitter Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
Dataset authored and provided by
Search Logistics
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These Twitter user statistics will give you the complete story of where Twitter is at today and what the future looks like for the social media company.
Twitter Hate Speech Dataset for the Saudi Dialect
data.mendeley.com
Updated Nov 1, 2024
+ more versions
Cite
Ali Alhazmi (2024). Twitter Hate Speech Dataset for the Saudi Dialect [Dataset]. http://doi.org/10.17632/c2jpnv9yk6.4
Explore at:
Unique identifier
https://doi.org/10.17632/c2jpnv9yk6.4
Dataset updated
Nov 1, 2024
Authors
Ali Alhazmi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Saudi Arabia
Description
The data were collected using the standard Twitter API on Arabic tweets and code-mixing datasets. Collection ran for three months, from April 2023 to June 2023, using a combination of keyword, thread-based, and profile-based search approaches. A total of 120 terms, including various versions, were used to identify tweets containing code-mixing concerning regional hate speech. To conduct the thread-based search, we incorporated hashtags related to contentious subjects that are deemed essential markers of hateful speech. Throughout the data-gathering phase, we monitored Twitter trends and designated ten hashtags for information retrieval. Given that hateful tweets are usually less common than regular tweets, we expanded our dataset and improved the representation of the hate class by incorporating the most impactful terms from a lexicon of religious hate terms (Albadi et al., 2018). We gathered exclusively original Arabic tweets for all queries, excluding retweets and non-Arabic tweets. In all, we obtained 200,000 tweets, of which we sampled 35k for annotation.
Famous Keyword Twitter Replies Dataset
paperswithcode.com
Updated Jun 16, 2023
+ more versions
Cite
(2023). Famous Keyword Twitter Replies Dataset [Dataset]. https://paperswithcode.com/dataset/famous-keyword-twitter-replies
Explore at:
Dataset updated
Jun 16, 2023
Description
The "Famous Keyword Twitter Replies Dataset" is a comprehensive collection of Twitter data that focuses on popular keywords and their associated replies. This dataset contains five essential columns that provide valuable insights into the Twitter conversation dynamics:

Keyword: This column represents the specific keyword or topic of interest that generated the original tweet. It helps identify the context or subject matter around which the conversation revolves.

Main_tweet: The main_tweet column contains the original tweet related to the keyword. It serves as the starting point or focal point of the conversation and often provides essential information or opinions on the given topic.

Main_likes: This column provides the number of likes received by the main_tweet. Likes serve as a measure of engagement and indicate the level of popularity or resonance of the original tweet within the Twitter community.

Reply: The reply column consists of the replies or responses to the main_tweet. These replies may include comments, opinions, additional information, or discussions related to the keyword or the original tweet itself. The replies help capture the diverse perspectives and conversations that emerge in response to the main_tweet.

Reply_likes: This column records the number of likes received by each reply. Similar to the main_likes column, the reply_likes column measures the level of engagement and popularity of individual replies. It enables the identification of particularly noteworthy or well-received replies within the dataset.

By analyzing this "Famous Keyword Twitter Replies Dataset," researchers, analysts, and data scientists can gain valuable insights into how popular keywords spark discussions on Twitter and how these discussions evolve through replies.

The dataset's information on likes allows for the evaluation of tweet and reply popularity, helping to identify influential or impactful content.

This dataset serves as a valuable resource for various applications, including sentiment analysis, trend identification, opinion mining, and understanding social media dynamics.

Number of tweet/reply pairs

The dataset contains a total of 17,255 tweet/reply pairs.
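To illustrate how the five columns relate, the sketch below picks the most-liked reply for each main tweet. The column names follow the description above; the sample rows, and the assumption that like counts arrive as strings (as they would from a CSV reader), are hypothetical.

```python
def top_replies(rows):
    """Return {main_tweet: (reply, reply_likes)}, keeping the most-liked reply."""
    best = {}
    for row in rows:
        key = row["Main_tweet"]
        likes = int(row["Reply_likes"])
        if key not in best or likes > best[key][1]:
            best[key] = (row["Reply"], likes)
    return best

# Hypothetical rows: one dict per tweet/reply pair.
ROWS = [
    {"Keyword": "python", "Main_tweet": "A", "Main_likes": "10", "Reply": "r1", "Reply_likes": "2"},
    {"Keyword": "python", "Main_tweet": "A", "Main_likes": "10", "Reply": "r2", "Reply_likes": "7"},
    {"Keyword": "rust", "Main_tweet": "B", "Main_likes": "3", "Reply": "r3", "Reply_likes": "0"},
]
```

Because each row is one tweet/reply pair, a main tweet with several replies appears on several rows, which is why the function groups by `Main_tweet`.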
Global Political tweets
kaggle.com
Updated Aug 23, 2022
Cite
Kash (2022). Global Political tweets [Dataset]. https://www.kaggle.com/kaushiksuresh147/political-tweets/tasks
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 23, 2022
Dataset provided by
Kaggle
Authors
Kash
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Social media is becoming a key medium through which we communicate with each other: it is at the center of the very structures of our daily interactions. Yet this infiltration is not unique to interpersonal relations. Political leaders, governments, and states operate within this social media environment, wherein they continually address crises and institute damage control through platforms such as Twitter.

With the spread of the internet to the masses, social media is emerging as a potential channel of communication. It gives politicians a direct way to communicate, connect, and engage with the public. The power of social media, especially Twitter and Facebook, has been demonstrated by its successful use during recent US presidential elections and the uprisings in Arab countries. In India too, with the general election approaching in early 2014, political parties and leaders are trying to harness the power of social media.

Content

The tweets have the #Politics hashtag. The collection started on 24/7/2021, and will be updated on a daily basis.

Information regarding the data

The data consists of more than 1 lakh (100,000) records with 13 columns. The features are described below.

| No | Columns | Descriptions |
| -- | -- | -- |
| 1 | user_name | The name of the user, as they’ve defined it. |
| 2 | user_location | The user-defined location for this account’s profile. |
| 3 | user_description | The user-defined UTF-8 string describing their account. |
| 4 | user_created | Time and date when the account was created. |
| 5 | user_followers | The number of followers an account currently has. |
| 6 | user_friends | The number of friends an account currently has. |
| 7 | user_favourites | The number of favorites an account currently has. |
| 8 | user_verified | When true, indicates that the user has a verified account. |
| 9 | date | UTC time and date when the Tweet was created. |
| 10 | text | The actual UTF-8 text of the Tweet. |
| 11 | hashtags | All the other hashtags posted in the tweet along with #Politics. |
| 12 | source | Utility used to post the Tweet. Tweets from the Twitter website have a source value of web. |
| 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |
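As one possible first pass over these columns, the sketch below counts tweets per posting utility (`source`) while skipping rows flagged in `is_retweet`. How the flag is serialized is not stated, so the comparison against the string "true" (case-insensitive) is an assumption, as are the sample rows.

```python
from collections import Counter

def source_counts(rows):
    """Count tweets per posting utility, ignoring rows flagged as retweets."""
    return Counter(
        row["source"]
        for row in rows
        # Assumed serialisation: the flag reads "True"/"False" (or a bool).
        if str(row["is_retweet"]).strip().lower() != "true"
    )

# Hypothetical rows using two of the 13 documented columns.
ROWS = [
    {"source": "Twitter Web App", "is_retweet": "False"},
    {"source": "Twitter for Android", "is_retweet": "False"},
    {"source": "Twitter Web App", "is_retweet": "True"},
    {"source": "Twitter Web App", "is_retweet": False},
]
```

Converting the flag with `str(...)` lets the same function accept both raw CSV strings and already-parsed booleans.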

Inspiration

You can use this data to dive into the subjects that use this hashtag, look to the geographical distribution, evaluate sentiments, and look at trends.
Twitter Revenue Growth
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Revenue Growth [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Advertising makes up 89% of its total revenue and data licensing makes up about 11%.
Data from: Google Analytics & Twitter dataset from a movies, TV series and...
figshare.com
portalcientificovalencia.univeuropea.com
txt
Updated Feb 7, 2024
Cite
Víctor Yeste (2024). Google Analytics & Twitter dataset from a movies, TV series and videogames website [Dataset]. http://doi.org/10.6084/m9.figshare.16553061.v4
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.16553061.v4
Dataset updated
Feb 7, 2024
Dataset provided by
Figshare http://figshare.com/
Authors
Víctor Yeste
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.The object of this study is the design of a cybermetric methodology whose objectives are to measure the success of the content published in online media and the possible prediction of the selected success variables.In this case, due to the need to integrate data from two separate areas, such as web publishing and the analysis of their shares and related topics on Twitter, has opted for programming as you access both the Google Analytics v4 reporting API and Twitter Standard API, always respecting the limits of these.The website analyzed is hellofriki.com. It is an online media whose primary intention is to solve the need for information on some topics that provide daily a vast number of news in the form of news, as well as the possibility of analysis, reports, interviews, and many other information formats. All these contents are under the scope of the sections of cinema, series, video games, literature, and comics.This dataset has contributed to the elaboration of the PhD Thesis:Yeste Moreno, VM. (2021). Diseño de una metodología cibermétrica de cálculo del éxito para la optimización de contenidos web [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/176009Data have been obtained from each last-minute news article published online according to the indicators described in the doctoral thesis. 
All related data are stored in a database, divided into the following tables:tesis_followers: User ID list of media account followers.tesis_hometimeline: data from tweets posted by the media account sharing breaking news from the web.status_id: Tweet IDcreated_at: date of publicationtext: content of the tweetpath: URL extracted after processing the shortened URL in textpost_shared: Article ID in WordPress that is being sharedretweet_count: number of retweetsfavorite_count: number of favoritestesis_hometimeline_other: data from tweets posted by the media account that do not share breaking news from the web. Other typologies, automatic Facebook shares, custom tweets without link to an article, etc. With the same fields as tesis_hometimeline.tesis_posts: data of articles published by the web and processed for some analysis.stats_id: Analysis IDpost_id: Article ID in WordPresspost_date: article publication date in WordPresspost_title: title of the articlepath: URL of the article in the middle webtags: Tags ID or WordPress tags related to the articleuniquepageviews: unique page viewsentrancerate: input ratioavgtimeonpage: average visit timeexitrate: output ratiopageviewspersession: page views per sessionadsense_adunitsviewed: number of ads viewed by usersadsense_viewableimpressionpercent: ad display ratioadsense_ctr: ad click ratioadsense_ecpm: estimated ad revenue per 1000 page viewstesis_stats: data from a particular analysis, performed at each published breaking news item. 
Fields with statistical values can be computed from the data in the other tables, but total and average calculations are saved for faster and easier further processing.
id: ID of the analysis
phase: phase of the thesis in which analysis has been carried out (right now all are 1)
time: "0" if at the time of publication, "1" if 14 days later
start_date: date and time of measurement on the day of publication
end_date: date and time when the measurement is made 14 days later
main_post_id: ID of the published article to be analysed
main_post_theme: Main section of the published article to analyze
superheroes_theme: "1" if about superheroes, "0" if not
trailer_theme: "1" if trailer, "0" if not
name: empty field, possibility to add a custom name manually
notes: empty field, possibility to add personalized notes manually, as if some tag has been removed manually for being considered too generic, despite the fact that the editor put it
num_articles: number of articles analysed
num_articles_with_traffic: number of articles analysed with traffic (which will be taken into account for traffic analysis)
num_articles_with_tw_data: number of articles with data from when they were shared on the media’s Twitter account
num_terms: number of terms analyzed
uniquepageviews_total: total page views
uniquepageviews_mean: average page views
entrancerate_mean: average input ratio
avgtimeonpage_mean: average duration of visits
exitrate_mean: average output ratio
pageviewspersession_mean: average page views per session
total: total of ads viewed
adsense_adunitsviewed_mean: average of ads viewed
adsense_viewableimpressionpercent_mean: average ad display ratio
adsense_ctr_mean: average ad click ratio
adsense_ecpm_mean: estimated ad revenue per 1000 page views
Total: total income
retweet_count_mean: average income
favorite_count_total: total of favorites
favorite_count_mean: average of favorites
terms_ini_num_tweets: total tweets on the terms on the day of publication
terms_ini_retweet_count_total: total retweets on the terms on the day of publication
terms_ini_retweet_count_mean: average retweets on the terms on the day of publication
terms_ini_favorite_count_total: total of favorites on the terms on the day of publication
terms_ini_favorite_count_mean: average of favorites on the terms on the day of publication
terms_ini_followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the terms on the day of publication
terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms on the day of publication
terms_ini_user_num_tweets_mean: average number of tweets published by users who spoke about the terms on the day of publication
terms_ini_user_age_mean: average age in days of users who have spoken of the terms on the day of publication
terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms on the day of publication
terms_end_num_tweets: total tweets on terms 14 days after publication
terms_ini_retweet_count_total: total retweets on terms 14 days after publication
terms_ini_retweet_count_mean: average retweets on terms 14 days after publication
terms_ini_favorite_count_total: total bookmarks on terms 14 days after publication
terms_ini_favorite_count_mean: average of favorites on terms 14 days after publication
terms_ini_followers_talking_rate: ratio of media Twitter account followers who have recently posted a tweet talking about the terms 14 days after publication
terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms 14 days after publication
terms_ini_user_num_tweets_mean: average number of tweets published by users who have spoken about the terms 14 days after publication
terms_ini_user_age_mean: the average age in days of users who have spoken of the terms 14 days after publication
terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms 14 days after publication.

tesis_terms: data of the terms (tags) related to the processed articles.
stats_id: Analysis ID
time: "0" if at the time of publication, "1" if 14 days later
term_id: Term ID (tag) in WordPress
name: Name of the term
slug: URL of the term
num_tweets: number of tweets
retweet_count_total: total retweets
retweet_count_mean: average retweets
favorite_count_total: total of favorites
favorite_count_mean: average of favorites
followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the term
user_num_followers_mean: average followers of users who were talking about the term
user_num_tweets_mean: average number of tweets published by users who were talking about the term
user_age_mean: average age in days of users who were talking about the term
url_inclusion_rate: URL inclusion ratio
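
The description says the stored totals and averages can be recomputed from the other tables. The sketch below shows one way that might look for the per-term rows, using field names from the tesis_terms list. The sample values are invented, and the rule "mean = total divided by number of tweets" is an assumption, since the source does not say how the means were derived.

```python
# Hypothetical per-term rows using field names from the tesis_terms list.
# The values are invented for illustration only.
term_rows = [
    {"stats_id": 1, "time": 0, "num_tweets": 120,
     "retweet_count_total": 300, "favorite_count_total": 480},
    {"stats_id": 1, "time": 0, "num_tweets": 80,
     "retweet_count_total": 100, "favorite_count_total": 160},
    {"stats_id": 1, "time": 1, "num_tweets": 50,
     "retweet_count_total": 25, "favorite_count_total": 75},
]


def summarize_terms(rows, stats_id, time):
    """Aggregate per-term counts into analysis-level totals and per-tweet means."""
    selected = [r for r in rows if r["stats_id"] == stats_id and r["time"] == time]
    num_tweets = sum(r["num_tweets"] for r in selected)
    summary = {"num_tweets": num_tweets}
    for field in ("retweet_count", "favorite_count"):
        total = sum(r[field + "_total"] for r in selected)
        summary[field + "_total"] = total
        # Assumption: the mean is the total divided by the number of tweets.
        summary[field + "_mean"] = total / num_tweets if num_tweets else 0.0
    return summary


# Summary for analysis 1 on the day of publication (time == 0).
ini_summary = summarize_terms(term_rows, stats_id=1, time=0)
print(ini_summary)
```

Storing these values, as the dataset does, avoids repeating this aggregation every time an analysis is read.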
Twitter trends/tweet for October 2017
kaggle.com
Updated Nov 1, 2017
Cite
Ryan Sloot (2017). Twitter trends/tweet for October 2017 [Dataset]. https://www.kaggle.com/rsloot/twitter-trendstweet-for-october-2017/code
Dataset updated
Nov 1, 2017
Dataset provided by
Kaggle
Authors
Ryan Sloot
Description
Context

Twitter.

Content

Trends, mentions, and hashtags of the top trends in October 2017, pulled from the Twitter API throughout the month by a script that ran to get data every 4 minutes (max).

Twitter and Google Trend data about heat waves in India 2010-2017
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Francesca Cecinati (2020). Twitter and Google Trend data about heat waves in India 2010-2017 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1307995
Dataset updated
Jan 24, 2020
Dataset authored and provided by
Francesca Cecinati
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains:

1) The list of tweets corresponding to the keywords "heat wave india" and "heatwave india" between 2010 and 2017.

2) The daily count of the same tweets

3) The monthly Google Trends data corresponding to the keywords "heat wave", "heatwave", "heat wave india", and "heatwave india" limited to the searches from India in the period 2010-2017

The Twitter data were obtained with the Python package Get-Old-Tweets (https://github.com/Jefferson-Henrique/GetOldTweets-python); the Google Trends data were obtained from the Google Trends webpage (https://trends.google.com/trends/?geo=US).
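
The daily count in item 2 can be derived from the tweet list in item 1. A minimal sketch follows, assuming each tweet record carries a "date" string in the form "YYYY-MM-DD HH:MM"; the field names and sample records are hypothetical and not taken from the dataset files.

```python
from collections import Counter
from datetime import datetime

# Hypothetical tweet records; the real files may use different field names.
tweets = [
    {"date": "2015-05-25 10:15", "text": "heat wave india ..."},
    {"date": "2015-05-25 18:40", "text": "heatwave india ..."},
    {"date": "2015-05-26 09:05", "text": "heat wave india ..."},
]


def daily_counts(records, fmt="%Y-%m-%d %H:%M"):
    """Return a dict mapping ISO dates to the number of tweets on that day."""
    counts = Counter(
        datetime.strptime(r["date"], fmt).date().isoformat() for r in records
    )
    # Sort by date so the result reads as a time series.
    return dict(sorted(counts.items()))


counts_by_day = daily_counts(tweets)
print(counts_by_day)
```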
Social Media Datasets
brightdata.com
.json, .csv, .xlsx
Updated Sep 7, 2022
Cite
Bright Data (2022). Social Media Datasets [Dataset]. https://brightdata.com/products/datasets/social-media
Available download formats: .json, .csv, .xlsx
Dataset updated
Sep 7, 2022
Dataset authored and provided by
Bright Data https://brightdata.com/
License
https://brightdata.com/license
Area covered
Worldwide
Description
Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.

Dataset Features

User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.

Customizable Subsets for Specific Needs

Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.

Popular Use Cases

Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.

Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Data from: Hashtags used by museums Twitter accounts from REMED in 2021
portalcientificovalencia.univeuropea.com
figshare.com
Updated 2024
Cite
Yeste, Víctor; Yeste, Víctor (2024). Hashtags used by museums Twitter accounts from REMED in 2021 [Dataset]. https://portalcientificovalencia.univeuropea.com/documentos/67321ed2aea56d4af0485dc2?lang=en
Dataset updated
2024
Authors
Yeste, Víctor; Yeste, Víctor
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.

This study consists of quantitative, explanatory, and non-experimental research using inductive inference longitudinally. It studies the use of hashtags by the Twitter accounts of the museums that are part of REMED and analyzes hashtag trends among Twitter users in Spanish.

The primary variable is the favorite count, and the study hypothesizes that it is possible to predict the primary variable five weeks later. The field of study is formed by the 104 Twitter accounts of the museums that are part of REMED (Red de Museos y Estrategias Digitales).

Seven analysis variables explain the information related to the use of hashtags, both within the scope of the Twitter accounts of the museums in the chosen sample (prefix "m_" in the variables) and among Twitter users in Spanish in general (prefix "tw_" in the variables). All variables represent the data in count mode, which means that they sum up the total of the data collected for each tweet of each hashtag processed:

Number of tweets (variable name "num_tweets")
Number of retweets (variable name "retweet_count")
Number of favorites (variable name "favorite_count")
Number of followers of tweeters (variable name "user_num_followers")
Number of tweets published by tweeters (variable name "user_num_tweets")
Age in days of tweeters' Twitter accounts (variable name "user_age")
Number of tweets including a URL (variable name "url_inclusion")

With the variables above, the correlations between the variables were checked and a regression analysis was performed. Thus, the relationships between the variables are ascertained and analyzed to determine whether it is possible to predict the number of favorites of the hashtags used by museums.

The initial intake is presented with the suffix "_t0" in the columns, and the intake made 5 weeks later is presented with the suffix "_t1" in the columns.

This dataset has contributed to the elaboration of the book chapter:

Yeste Moreno, V.; Calduch-Losa, Á.; Serrano-Cobos, J. (2022). Estudio predictivo del uso colectivo de hashtags en museos de la red REMED. En CIMED21 - I Congreso internacional de museos y estrategias digitales. Editorial Universitat Politècnica de València. 251-265. https://doi.org/10.4995/CIMED21.2021.12281
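
The correlation and regression step described above can be illustrated with a small sketch. It computes a Pearson correlation and a one-variable least-squares fit between a "_t0" column and a "_t1" column. The composed column names (for example "m_favorite_count_t0") are an assumption built from the prefixes and suffixes in the description, the sample values are invented, and this is not the authors' actual analysis.

```python
from math import sqrt

# Invented sample rows; column names are assumed from the described
# prefix ("m_") and suffixes ("_t0", "_t1").
rows = [
    {"m_favorite_count_t0": 10, "m_favorite_count_t1": 21},
    {"m_favorite_count_t0": 20, "m_favorite_count_t1": 41},
    {"m_favorite_count_t0": 30, "m_favorite_count_t1": 61},
    {"m_favorite_count_t0": 40, "m_favorite_count_t1": 81},
]


def correlate_and_fit(xs, ys):
    """Return (pearson_r, slope, intercept) for a simple linear fit y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


x_t0 = [row["m_favorite_count_t0"] for row in rows]
y_t1 = [row["m_favorite_count_t1"] for row in rows]
r, slope, intercept = correlate_and_fit(x_t0, y_t1)
print(r, slope, intercept)
```

A correlation close to 1 or -1 together with a stable fit would support the hypothesis that the "_t0" value helps predict the "_t1" value; real data would also call for significance testing, which this sketch omits.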
Twitter Users Broken Down By Age
searchlogistics.com
Updated Apr 1, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). Twitter Users Broken Down By Age [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the breakdown of Twitter users by age group.
Tweet Sentiment's Impact on Stock Returns
kaggle.com
Updated Jan 16, 2023
Cite
The Devastator (2023). Tweet Sentiment's Impact on Stock Returns [Dataset]. https://www.kaggle.com/datasets/thedevastator/tweet-sentiment-s-impact-on-stock-returns
Dataset updated
Jan 16, 2023
Dataset provided by
Kaggle http://kaggle.com/
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Tweet Sentiment's Impact on Stock Returns

862,231 Labeled Instances

By [source]

About this dataset

This dataset contains 862,231 labeled tweets and associated stock returns, providing a look into the impact of social media on company-level stock market performance. For each tweet, researchers have extracted data such as the date of the tweet and its associated stock symbol, along with metrics such as last price and various returns (1-day return, 2-day return, 3-day return, 7-day return). Also recorded are volatility scores for both 10-day and 30-day intervals. Finally, sentiment scores from both Long Short-Term Memory (LSTM) and TextBlob models have been included to quantify the overall tone of each message. With this dataset you can explore how tweets may affect a company's share price in both the short term and the long term.

How to use the dataset

To use this dataset, users can apply descriptive statistics such as histograms, or regression techniques, to establish relationships between tweet content and sentiment and the corresponding stock return data, such as 1-day and 7-day returns.

The primary fields used for analysis include the tweet text (TWEET), stock symbol (STOCK), date (DATE), and closing price at the time of the tweet (LAST_PRICE), along with two volatility measures for each stock, 10-day volatility (VOLATILITY_10D) and 30-day volatility (VOLATILITY_30D), which capture changes in market fluctuation during different periods around when Twitter reactions occur. Additionally, sentiment polarity scores from two models, LSTM polarity (LSTM_POLARITY) and TextBlob polarity, provide insight into whether people are expressing positive or negative sentiments about each company at a given time, which could in turn influence stock prices over shorter periods such as 1-day returns (1_DAY_RETURN) and 2-day returns (2_DAY_RETURN) or over a longer horizon such as 7-day returns. Finally, the MENTION field indicates whether names or acronyms associated with the companies were specifically mentioned in each tweet, which gives extra insight into whether company-specific context was present within individual tweets (“company relevancy”).
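
As a rough illustration of how these fields might be combined, the sketch below groups tweets by the sign of LSTM_POLARITY and averages 1_DAY_RETURN within each group. The column names come from the description above, but the sample rows are invented and the assumption that polarity is a signed number is not confirmed by the source. For the real file, the in-memory text would be replaced by the opened CSV.

```python
import csv
import io

# Invented sample in the shape of the described CSV; not real data.
sample_csv = """TWEET,STOCK,LSTM_POLARITY,1_DAY_RETURN
good quarter,AAA,1,0.02
bad news,AAA,-1,-0.01
great launch,BBB,1,0.04
weak guidance,BBB,-1,-0.03
"""


def mean_return_by_sentiment(csv_file):
    """Average 1-day return for tweets with positive vs. negative LSTM polarity."""
    sums = {"positive": 0.0, "negative": 0.0}
    counts = {"positive": 0, "negative": 0}
    for row in csv.DictReader(csv_file):
        polarity = float(row["LSTM_POLARITY"])
        if polarity == 0:
            continue  # Neutral tweets are skipped in this sketch.
        group = "positive" if polarity > 0 else "negative"
        sums[group] += float(row["1_DAY_RETURN"])
        counts[group] += 1
    # Only report groups that actually contain tweets.
    return {g: sums[g] / counts[g] for g in sums if counts[g]}


means = mean_return_by_sentiment(io.StringIO(sample_csv))
print(means)
```

A gap between the two averages only hints at an association; it does not show that tweets move prices.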

Research Ideas

Analyzing the degree to which tweets can influence stock prices. By analyzing relationships between variables such as tweet sentiment and stock returns, correlations can be identified that could be used to inform investment decisions.

Exploring natural language processing (NLP) models for predicting future market trends based on textual data such as tweets. Through testing and evaluating different text-based models using this dataset, better predictive models may emerge that can give investors advance warning of upcoming market shifts due to news or other events.

Investigating the impact of different types of tweets (positive/negative, factual/opinionated) on stock prices over specific time frames. By studying correlations between the sentiment or nature of a tweet and its effect on stocks, insights may be gained into what sort of news or events have a greater impact on markets in general

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No Copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: reduced_dataset-release.csv

| Column name | Description |
|:------------|:------------|
| TWEET | Text of the tweet. (String) |
| STOCK | Company's stock mentioned in the tweet. (String) |
| DATE | Date the tweet was posted. (Date) |
| LAST_PRICE | Company's last price at the time of tweeting. (Float) |

...