100+ datasets found

Famous Words Twitter Dataset
kaggle.com
Updated May 30, 2023
Cite
_w1998 (2023). Famous Words Twitter Dataset [Dataset]. https://www.kaggle.com/datasets/jackksoncsie/twitter-dataset-keywords-likes-and-tweets/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 30, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
_w1998
License
http://www.gnu.org/licenses/agpl-3.0.html
Description
The Famous Words Twitter Dataset is a comprehensive collection of tweets associated with famous words. The dataset provides valuable insights into the social media engagement and popularity of these words on the Twitter platform. It includes three primary columns: keyword, likes, and tweets.

The keyword column represents the specific famous word or phrase associated with each tweet. It allows researchers and analysts to explore the dynamics of user interactions and discussions surrounding these popular terms on Twitter.

The likes column indicates the number of likes received by each tweet. This metric serves as an indicator of the tweet's popularity and resonance among Twitter users.

The tweet column contains the actual tweet text, capturing the content and context of user-generated messages related to the famous words. This column provides valuable qualitative data for sentiment analysis, topic modeling, and other natural language processing tasks.

Researchers, data scientists, and social media analysts can leverage this dataset to study various aspects, such as tracking trends, sentiment analysis, understanding user engagement patterns, and identifying influential topics associated with famous words on Twitter.
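
As a minimal sketch of the kind of engagement analysis described above, the following computes the mean number of likes per keyword. The column names `keyword`, `likes`, and `tweet` follow the description; the exact header spelling in the real file, and the sample rows, are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file's header spelling is an assumption.
SAMPLE_CSV = """keyword,likes,tweet
Bitcoin,12,"Bitcoin is up again, apparently"
Bitcoin,4,Still holding
Zoom,7,Another Zoom call
"""


def mean_likes_per_keyword(csv_text):
    """Return {keyword: mean likes} computed over all rows."""
    totals = defaultdict(lambda: [0, 0])  # keyword -> [sum of likes, row count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["keyword"]]
        entry[0] += int(row["likes"])
        entry[1] += 1
    return {kw: s / n for kw, (s, n) in totals.items()}


result = mean_likes_per_keyword(SAMPLE_CSV)
print(result)
```

The same grouping could be done with a dataframe library; the standard library is used here only to keep the sketch dependency-free.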

Topics: "COVID-19", "Vaccine", "Zoom", "Bitcoin", "Dogecoin", "NFT", "Elon Musk", "Tesla", "Amazon", "iPhone 12", "Remote work", "TikTok", "Instagram", "Facebook", "YouTube", "Netflix", "GameStop", "Super Bowl", "Olympics", "Black Lives Matter", "India vs England", "Ukraine", "Queen Elizabeth", "World Cup", "Jeffrey Dahmer", "Johnny Depp", "Will Smith", "Weather", "xvideo", "porn", "nba", "Macdonald"

The dataset contains 128837 tweets in total. The plot below shows the number of tweets for each keyword:

https://i.imgur.com/z4xbbyt.png

Note: The dataset is carefully curated, anonymized, and stripped of any personally identifiable information to protect user privacy.
Twitter Dataset
brightdata.com
.json, .csv, .xlsx
Updated Dec 23, 2024
Cite
Bright Data (2024). Twitter Dataset [Dataset]. https://brightdata.com/products/datasets/twitter
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Dec 23, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our Twitter dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset provides a comprehensive understanding of social media trends, empowering organizations to refine their communication and marketing strategies. Access the entire dataset or customize a subset to fit your needs. Popular use cases include market research to identify trending topics and hashtags, AI training by reviewing factors such as tweet content, retweets, and user interactions for predictive analytics, and trend forecasting by examining correlations between specific themes and user engagement to uncover emerging social media preferences.
Twitter Tweets Sentiment Dataset
kaggle.com
Updated Apr 8, 2022
Cite
M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 8, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
M Yasser H
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
https://raw.githubusercontent.com/Masterx-AI/Project_Twitter_Sentiment_Analysis_/main/twitt.jpg

Description:

Twitter is an online social media platform where people share their thoughts as tweets. Some people misuse it to tweet hateful content. Twitter is trying to tackle this problem, and we shall help by creating a strong NLP-based classifier model to distinguish negative tweets and block them. Can you build a strong classifier model to predict the same?

Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.

You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

Columns:

textID - unique ID for each piece of text

text - the text of the tweet

sentiment - the general sentiment of the tweet
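
The quote-stripping advice above can be sketched as follows. This is a minimal illustration, assuming the file has a header row with the columns listed (plus `selected_text` in the training set); the sample IDs and texts are made up.

```python
import csv
import io

# Hypothetical sample in the column layout described above; the IDs and
# texts are invented for illustration.
SAMPLE_CSV = '''textID,text,selected_text,sentiment
abc123,"""I love this!""",love this,positive
def456,plain text here,plain text,neutral
'''


def strip_outer_quotes(value):
    """Drop one pair of matching quotes wrapped around the whole string."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


def load_rows(csv_text):
    """Parse the CSV and clean the text field of every row."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["text"] = strip_outer_quotes(row["text"])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
print([r["text"] for r in rows])
```

The CSV reader already removes the quoting that belongs to the file format, so `strip_outer_quotes` only handles quote characters that survive as part of the field value.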

Acknowledgement:

The dataset was downloaded from Kaggle Competitions:
https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

Objective:

Understand the Dataset & cleanup (if required).

Build classification models to predict the twitter sentiments.

Compare the evaluation metrics of various classification algorithms.
Twitter Revenue Growth
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Revenue Growth [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Advertising makes up 89% of Twitter's total revenue, and data licensing makes up about 11%.
Twitter Profiles Dataset
brightdata.com
.json, .csv, .xlsx
Updated Nov 27, 2024
Cite
Bright Data (2024). Twitter Profiles Dataset [Dataset]. https://brightdata.com/products/datasets/twitter/profiles
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Nov 27, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Leverage our Twitter profiles dataset for a wide range of applications to enhance business strategies and market insights. Analyzing this dataset offers a deep understanding of user demographics, engagement patterns, and online behavior, enabling organizations to optimize their communication and marketing strategies. Access the complete dataset or tailor a subset to meet your specific requirements. Popular use cases include market research to identify influential profiles and emerging audiences, AI training by analyzing follower demographics and engagement data for predictive modeling, and trend forecasting by examining correlations between user bios, activity levels, and growth metrics to uncover evolving social media dynamics.
Twitter cascade dataset
researchdata.smu.edu.sg
smu.edu.sg
+1more
pdf
Updated May 31, 2023
Cite
Living Analytics Research Centre (2023). Twitter cascade dataset [Dataset]. http://doi.org/10.25440/smu.12062709.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.25440/smu.12062709.v1
Dataset updated
May 31, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
Living Analytics Research Centre
License
http://rightsstatements.org/vocab/InC/1.0/
Description
This dataset comprises a set of information cascades generated by Singapore Twitter users. Here a cascade is defined as a set of tweets about the same topic. This dataset was collected via the Twitter REST and streaming APIs in the following way. Starting from popular seed users (i.e., users having many followers), we crawled their follow, retweet, and user mention links. We then added those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. With this, we have a total of 184,794 Twitter user accounts. We then crawled tweets from these users from 1 April to 31 August 2012. In all, we collected 32,479,134 tweets. To identify cascades, we extracted all the URL links and hashtags from the above tweets. These URL links and hashtags serve as the identities of cascades. In other words, all the tweets which contain the same URL link (or the same hashtag) represent a cascade. Mathematically, a cascade is represented as a set of user-timestamp pairs. Figure 1 provides an example, i.e. cascade C = {< u1, t1 >, < u2, t2 >, < u1, t3 >, < u3, t4 >, < u4, t5 >}. For evaluation, the dataset was split into two parts: four months of data for training and the last month of data for testing. Table 1 summarizes the basic (count) statistics of the dataset. Each line in each file represents a cascade. The first term in each line is a hashtag or URL; the second term is a list of user-timestamp pairs. Due to privacy concerns, all user identities are anonymized.
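
The cascade definition above can be illustrated with a short sketch: tweets are grouped by each hashtag or URL they contain, and every group becomes a list of user-timestamp pairs. The regular expression, the tuple layout of the input, and the sample tweets are assumptions for illustration, not the authors' extraction code.

```python
import re
from collections import defaultdict

# Simplified pattern for hashtags and URLs; an assumption, not the
# extraction rule used by the dataset authors.
TOKEN = re.compile(r"(#\w+|https?://\S+)")


def build_cascades(tweets):
    """Group (user, timestamp, text) tweets into cascades keyed by hashtag/URL."""
    cascades = defaultdict(list)
    for user, timestamp, text in tweets:
        # set() avoids counting a tweet twice when it repeats the same tag
        for key in set(TOKEN.findall(text)):
            cascades[key].append((user, timestamp))
    for pairs in cascades.values():
        pairs.sort(key=lambda pair: pair[1])  # order each cascade by time
    return dict(cascades)


# Hypothetical tweets with anonymized users and integer timestamps.
tweets = [
    ("u1", 1, "lunch at #hawker"),
    ("u2", 2, "#hawker food again"),
    ("u1", 3, "still thinking about #hawker"),
    ("u3", 4, "read this http://example.com/a"),
]
cascades = build_cascades(tweets)
print(cascades["#hawker"])
```

As in the example cascade C, the same user may appear more than once in a cascade, each time with a different timestamp.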
Twitter Key Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Key Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are the key Twitter user statistics that you need to know.
X/Twitter: Countries with the largest audience 2025
statista.com
Updated Jun 19, 2025
Cite
Statista (2025). X/Twitter: Countries with the largest audience 2025 [Dataset]. https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/
Dataset updated
Jun 19, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
Worldwide
Description
Social network X/Twitter is particularly popular in the United States, and as of February 2025, the microblogging service had an audience reach of 103.9 million users in the country. Japan and India were ranked second and third, with more than 70 million and 25 million users respectively.

Global Twitter usage

As of the second quarter of 2021, X/Twitter had 206 million monetizable daily active users worldwide. The most-followed Twitter accounts include figures such as Elon Musk, Justin Bieber and former U.S. president Barack Obama.

X/Twitter and politics

X/Twitter has become an increasingly relevant tool in domestic and international politics. The platform has become a way to promote policies and interact with citizens and other officials, and most world leaders and foreign ministries have an official Twitter account. Former U.S. president Donald Trump used to be a prolific Twitter user before the platform permanently suspended his account in January 2021. During an August 2018 survey, 61 percent of respondents stated that Trump's use of Twitter as President of the United States was inappropriate.
Data from: Early prediction and characterization of high-impact world events...
figshare.com
txt
Updated Jun 1, 2023
Cite
Mauricio Quezada; jkalyana@ucsd.edu; bpoblete@dcc.uchile.cl; gert@ece.ucsd.edu (2023). Early prediction and characterization of high-impact world events using social media [Dataset]. http://doi.org/10.6084/m9.figshare.3465974.v4
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.3465974.v4
Dataset updated
Jun 1, 2023
Dataset provided by
figshare
Authors
Mauricio Quezada; jkalyana@ucsd.edu; bpoblete@dcc.uchile.cl; gert@ece.ucsd.edu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
World
Description
This dataset consists of 5234 news events obtained from Twitter. The file tweets.csv.gz (available upon request via email to the authors) contains a CSV file, called tweets.csv, with all the tweet IDs corresponding to each event in events.csv. The format of each line of the file is the following:

tweet_id, event_id

Where:

tweet_id is a long number indicating the Twitter ID of the given tweet. Using the Twitter REST API it is possible to retrieve all the information about the given tweet. event_id corresponds to the event ID of the given tweet.

The file events.csv.gz contains a CSV file, called events.csv, with all the news events captured from Twitter from August 2013 to June 2014. The format of each line of the file is the following:

event_ID,date,total_keywords,total_tweets,keywords

Where:

event_ID is an integer which identifies the corresponding event. There are 5234 events, so event_ID ranges from 1 to 5234.

date is the date of the event or connected component. The format is YYYY-MM-DD.

total_keywords is an integer indicating how many keywords are in the event or connected component.

total_tweets is an integer indicating how many tweets belong to this event.

keywords is a string containing total_keywords keywords, separated by semicolons.

The files cluster_labels.txt and time_resolutions.txt contain the cluster labels for each event and the time resolutions learned from all events, respectively.

cluster_labels.txt contains one integer number per line, from 0 to 19. In line i, the cluster label in that line corresponds to the event ID number i. time_resolutions.txt contains one floating point number per line, indicating the time resolution learned for all events, in minutes. There are 20 numbers in the file, one per line, in increasing order, with at most 13 decimal numbers after the point.
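
A minimal sketch of reading the formats described above: each events.csv line is split into its five fields, the keywords are separated on semicolons, and line i of cluster_labels.txt is attached to event ID i. The sample rows are invented, and the absence of a header row is an assumption.

```python
import csv
import io
from datetime import date

# Hypothetical sample lines in the documented layout; no header row is assumed.
EVENTS_TEXT = """1,2013-08-01,3,120,earthquake;chile;tsunami
2,2013-08-02,2,45,election;results
"""
LABELS_TEXT = "7\n0\n"  # line i holds the cluster label of event ID i


def parse_events(text):
    """Return {event_ID: record} from events.csv-style text."""
    events = {}
    for event_id, day, n_keywords, n_tweets, keywords in csv.reader(io.StringIO(text)):
        events[int(event_id)] = {
            "date": date.fromisoformat(day),
            "total_keywords": int(n_keywords),
            "total_tweets": int(n_tweets),
            "keywords": keywords.split(";"),
        }
    return events


def attach_cluster_labels(events, labels_text):
    """Add a 'cluster' entry to each event using 1-based line numbers."""
    for line_number, line in enumerate(labels_text.splitlines(), start=1):
        if line_number in events:
            events[line_number]["cluster"] = int(line)


events = parse_events(EVENTS_TEXT)
attach_cluster_labels(events, LABELS_TEXT)
print(events[1]["keywords"], events[2]["cluster"])
```

A simple consistency check on real data would be to confirm that `len(record["keywords"])` equals `record["total_keywords"]` for every event.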
Twitter Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
Search Logistics (2025). Twitter Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
Dataset authored and provided by
Search Logistics
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These Twitter user statistics will give you the complete story of where Twitter is at today and what the future looks like for the social media company.
Twitter Users Broken down By Country
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Users Broken down By Country [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The US has historically been the target country for Twitter since its launch in 2006. This is the full breakdown of Twitter users by country.
Data from: Twitter Big Data as A Resource For Exoskeleton Research: A...
search.dataone.org
Updated Nov 8, 2023
Cite
Thakur, Nirmalya (2023). Twitter Big Data as A Resource For Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions [Dataset]. http://doi.org/10.7910/DVN/VPPTRF
Unique identifier
https://doi.org/10.7910/DVN/VPPTRF
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Thakur, Nirmalya
Description
Please cite the following paper when using this dataset: N. Thakur, “Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions,” Preprints, 2022, DOI: 10.20944/preprints202206.0383.v1

Abstract

The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and use cases in assisted living, military, healthcare, firefighting, and industries. With the projected increase in the diverse uses of exoskeletons in the next few years in these application domains and beyond, it is crucial to study, interpret, and analyze user perspectives, public opinion, reviews, and feedback related to exoskeletons, for which a dataset is necessary. The Internet of Everything era of today's living, characterized by people spending more time on the Internet than ever before, holds the potential for developing such a dataset by mining relevant web behavior data from social media communications, which have increased exponentially in the last few years. Twitter, one such social media platform, is highly popular amongst all age groups, who communicate on diverse topics including but not limited to news, current events, politics, emerging technologies, family, relationships, and career opportunities, via tweets, while sharing their views, opinions, perspectives, and feedback towards the same. Therefore, this work presents a dataset of about 140,000 Tweets related to exoskeletons that were mined over a period of 5 years, from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons.

Instructions

This dataset contains about 140,000 Tweets related to exoskeletons that were mined over a period of 5 years, from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons. The dataset contains only tweet identifiers (Tweet IDs) because Twitter's terms and conditions permit re-distribution of Twitter data only for research purposes. They need to be hydrated to be used. The process of retrieving a tweet's complete information (such as the text of the tweet, username, user ID, date and time, etc.) using its ID is known as the hydration of a tweet ID. The Hydrator application (link to download the application: https://github.com/DocNow/hydrator/releases and link to a step-by-step tutorial: https://towardsdatascience.com/learn-how-to-easily-hydrate-tweets-a0f393ed340e#:~:text=Hydrating%20Tweets) or any similar application may be used for hydrating this dataset.

Data Description

This dataset consists of 7 .txt files. The following shows the number of Tweet IDs and the date range (of the associated tweets) in each of these files.

Filename: Exoskeleton_TweetIDs_Set1.txt (Number of Tweet IDs – 22945, Date Range of Tweets - July 20, 2021 – May 21, 2022)
Filename: Exoskeleton_TweetIDs_Set2.txt (Number of Tweet IDs – 19416, Date Range of Tweets - Dec 1, 2020 – July 19, 2021)
Filename: Exoskeleton_TweetIDs_Set3.txt (Number of Tweet IDs – 16673, Date Range of Tweets - April 29, 2020 - Nov 30, 2020)
Filename: Exoskeleton_TweetIDs_Set4.txt (Number of Tweet IDs – 16208, Date Range of Tweets - Oct 5, 2019 - Apr 28, 2020)
Filename: Exoskeleton_TweetIDs_Set5.txt (Number of Tweet IDs – 17983, Date Range of Tweets - Feb 13, 2019 - Oct 4, 2019)
Filename: Exoskeleton_TweetIDs_Set6.txt (Number of Tweet IDs – 34009, Date Range of Tweets - Nov 9, 2017 - Feb 12, 2019)
Filename: Exoskeleton_TweetIDs_Set7.txt (Number of Tweet IDs – 11351, Date Range of Tweets - May 21, 2017 - Nov 8, 2017)

Here, the last date for May is May 21 as it was the most recent date at the time of data collection. The dataset would be updated soon to incorporate more recent tweets.
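
As a rough sketch of preparing these files for hydration, the snippet below totals the per-file counts listed above and splits a list of IDs into fixed-size batches. The one-ID-per-line layout, the `batch_size` value, and the sample IDs are assumptions; the actual lookup call depends on whichever hydration tool is used and is not shown.

```python
# Per-file Tweet ID counts as listed in the data description above.
FILE_COUNTS = {
    "Exoskeleton_TweetIDs_Set1.txt": 22945,
    "Exoskeleton_TweetIDs_Set2.txt": 19416,
    "Exoskeleton_TweetIDs_Set3.txt": 16673,
    "Exoskeleton_TweetIDs_Set4.txt": 16208,
    "Exoskeleton_TweetIDs_Set5.txt": 17983,
    "Exoskeleton_TweetIDs_Set6.txt": 34009,
    "Exoskeleton_TweetIDs_Set7.txt": 11351,
}
total_ids = sum(FILE_COUNTS.values())  # consistent with "about 140,000"


def read_ids(text):
    """Parse Tweet IDs, assuming one ID per line (layout is an assumption)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def batches(ids, batch_size):
    """Yield consecutive slices of at most batch_size IDs."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


# Hypothetical IDs and batch size, for illustration only.
sample_ids = read_ids("111\n222\n333\n444\n555\n")
id_batches = list(batches(sample_ids, 2))
print(total_ids, id_batches)
```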
Twitter Users Broken Down By Age
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Twitter Users Broken Down By Age [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the breakdown of Twitter users by age group.
Twitter users in the United States 2019-2028
statista.com
ai-chatbox.pro
Updated Jul 31, 2025
Cite
Statista Research Department (2025). Twitter users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
Dataset updated
Jul 31, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
United States
Description
The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 85.08 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years.

User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of Twitter users in countries like Canada and Mexico.
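
The forecast figures can be cross-checked with a little arithmetic. The sketch below assumes that the percentage is measured against the 2024 value and that the 2028 figure equals the 2024 value plus the stated increase; neither assumption is stated in the source.

```python
# Figures quoted in the description, in millions of users.
users_2028 = 85.08
increase_2024_to_2028 = 4.3

# Assumption: 2028 value = 2024 value + total increase.
users_2024 = users_2028 - increase_2024_to_2028

# Assumption: the quoted percentage is relative to the 2024 value.
percent_increase = increase_2024_to_2028 / users_2024 * 100

print(round(users_2024, 2), round(percent_increase, 2))
```

Under these assumptions the implied 2024 base is about 80.78 million users, and the relative increase rounds to the quoted 5.32 percent.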
Data from: Twitter Dataset on the Russo-Ukrainian War
zenodo.org
data.niaid.nih.gov
Updated Oct 20, 2023
Cite
Alexander Shevtsov; Despoina Antonakaki; Ioannis Lamprou; Sotiris Ioannidis; Polyvios Pratikakis (2023). Twitter Dataset on the Russo-Ukrainian War [Dataset]. http://doi.org/10.5281/zenodo.8431047
Unique identifier
https://doi.org/10.5281/zenodo.8431047
Dataset updated
Oct 20, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alexander Shevtsov; Despoina Antonakaki; Ioannis Lamprou; Sotiris Ioannidis; Polyvios Pratikakis
Time period covered
Feb 23, 2022
Area covered
Ukraine
Description
On 24 February 2022, Russia invaded Ukraine, a conflict now also known as the Russo-Ukrainian War. We obtained our dataset through the Twitter API from 23 February 2022 until 23 June 2023. The collected dataset has 127,275,386 tweets, shared in the form of anonymized text, where the tweet/user IDs and user mentions are anonymized and do not provide any personal information. The provided dataset contains user discussion in more than 70 languages, of which the 20 most popular are: 'eng', 'fr', 'de', 'mix', 'it', 'es', 'ja', 'ru', 'pl', 'uk', 'tr', 'th', 'hi', 'qme', 'qht', 'nl', 'fi', 'ar', 'zh' and 'pt'. To preserve information integrity, tweets are separated and stored in different files ordered by creation date. The provided dataset is shared for further research purposes. Additionally, we provide the list of tweet IDs in the GitHub repository, which can be retrieved via the Twitter API. Furthermore, we also ran some initial analyses, covering volume/activity, hashtag popularity, sentiment, and military intelligence, and published the results in the web portal.
Antisemitism on Twitter: A Dataset for Machine Learning and Text Analytics
data.niaid.nih.gov
zenodo.org
Updated Dec 13, 2024
Cite
Soemer, Katharina (2024). Antisemitism on Twitter: A Dataset for Machine Learning and Text Analytics [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7872834
Dataset updated
Dec 13, 2024
Dataset provided by
Jikeli, Gunther
Karali, Sameer
Soemer, Katharina
Miehling, Daniel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset from the Institute for the Study of Contemporary Antisemitism (ISCA) at Indiana University:

The Social Media & Hate research lab at the Institute for the Study of Contemporary Antisemitism compiled this dataset using an annotation portal (Jikeli, Soemer, and Karali 2024), which was used to label tweets as either antisemitic or non-antisemitic, among other labels. Note that annotation was done on live data, including images and context, such as threads. All data was annotated by two experts, and all discrepancies were discussed (Jikeli et al. 2023).

Content:

This dataset contains 11311 tweets covering a wide range of topics common in conversations about Jews, Israel, and antisemitism between January 2019 and April 2023. The dataset consists of random samples of relevant keywords during this time period. 1,953 tweets (17%) are antisemitic according to the IHRA definition of antisemitism.

The distribution of tweets by year is as follows: 1499 (13%) from 2019, 3712 (33%) from 2020, 2591 (23%) from 2021, 2644 (23%) from 2022 and 865 (8%) from 2023. 6365 (56%) contain the keyword "Jews," 4134 (37%) include "Israel," 529 (5%) feature the derogatory term "ZioNazi*," and 283 (3%) use the slur "K---s." Some tweets may contain multiple keywords.

725 out of the 6365 tweets with the keyword "Jews" (11%) and 664 out of the 4134 tweets with the keyword "Israel" (16%) were classified as antisemitic. 97 out of the 283 tweets using the antisemitic slur "K---s" (34%) are antisemitic. Interestingly, many tweets featuring the slur "K---s" actually call out its use. In contrast, the majority of tweets using the derogatory term "ZioNazi*" are antisemitic, with 467 out of 529 (88%) being classified as such.
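
The percentages quoted above follow directly from the counts. As a quick worked check, using only the numbers given in the description and ordinary rounding to whole percent:

```python
TOTAL_TWEETS = 11311

# Tweets per year, as quoted in the description.
tweets_by_year = {2019: 1499, 2020: 3712, 2021: 2591, 2022: 2644, 2023: 865}

# (antisemitic tweets, all tweets) per keyword, as quoted in the description.
by_keyword = {
    "Jews": (725, 6365),
    "Israel": (664, 4134),
    "K---s": (97, 283),
    "ZioNazi*": (467, 529),
}


def percent(part, whole):
    """Share of part in whole, rounded to a whole percent."""
    return round(100 * part / whole)


year_total = sum(tweets_by_year.values())
year_shares = {year: percent(n, TOTAL_TWEETS) for year, n in tweets_by_year.items()}
antisemitic_shares = {kw: percent(a, n) for kw, (a, n) in by_keyword.items()}
overall_share = percent(1953, TOTAL_TWEETS)

print(year_total, year_shares, antisemitic_shares, overall_share)
```

The yearly counts add up to the full 11311 tweets, whereas the keyword counts add up to slightly more, which is consistent with the note that some tweets contain multiple keywords.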

File Description:

The dataset is provided in a csv file format, with each row representing a single message, including replies, quotes, and retweets. The file contains the following columns:

‘ID’: Represents the tweet ID.

‘Username’: Represents the username that posted the tweet.

‘Text’: Represents the full text of the tweet (not pre-processed).

‘CreateDate’: Represents the date on which the tweet was created.

‘Biased’: Represents the label given by our annotations as to whether the tweet is antisemitic or not.

‘Keyword’: Represents the keyword that was used in the query. The keyword can be in the text, including hashtags, mentioned users, or the username itself.

Licences

Data is published under the terms of the "Creative Commons Attribution 4.0 International" licence (https://creativecommons.org/licenses/by/4.0)

Acknowledgements

We are grateful for the support of Indiana University’s Observatory on Social Media (OSoMe) (Davis et al. 2016) and the contributions and annotations of all team members in our Social Media & Hate Research Lab at Indiana University’s Institute for the Study of Contemporary Antisemitism, especially Grace Bland, Elisha S. Breton, Kathryn Cooper, Robin Forstenhäusler, Sophie von Máriássy, Mabel Poindexter, Jenna Solomon, Clara Schilling, and Victor Tschiskale.

This work used Jetstream2 at Indiana University through allocation HUM200003 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.
Famous-Keyword-Twitter-Replies
huggingface.co
Cite
Po-Chuan Chen, Famous-Keyword-Twitter-Replies [Dataset]. https://huggingface.co/datasets/jacksoncsie/Famous-Keyword-Twitter-Replies
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Authors
Po-Chuan Chen
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The "Famous Keyword Twitter Replies Dataset" is a comprehensive collection of Twitter data that focuses on popular keywords and their associated replies. This dataset contains five essential columns that provide valuable insights into the Twitter conversation dynamics:

Keyword: This column represents the specific keyword or topic of interest that generated the original tweet. It helps identify the context or subject matter around which the conversation revolves.

Main_tweet: The main_tweet… See the full description on the dataset page: https://huggingface.co/datasets/jacksoncsie/Famous-Keyword-Twitter-Replies.
X/Twitter users in the United Kingdom 2019-2028
statista.com
Updated Jan 13, 2025
Cite
Statista Research Department (2025). X/Twitter users in the United Kingdom 2019-2028 [Dataset]. https://www.statista.com/topics/11843/x-formerly-twitter-in-the-united-kingdom-uk/
Dataset updated
Jan 13, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
United Kingdom
Description
The number of Twitter users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 0.9 million users (+5.1 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 18.55 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years.

User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
How Topics Affect Twitter Attention
stemfellowship.figshare.com
png
Updated Feb 5, 2017
Cite
Michal Fishkin; Jennifer Ou; Andrew Zhu (2017). How Topics Affect Twitter Attention [Dataset]. http://doi.org/10.6084/m9.figshare.4621009.v1
Explore at:
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.4621009.v1
Dataset updated
Feb 5, 2017
Dataset provided by
STEM Fellowship Big Data Challenge
Authors
Michal Fishkin; Jennifer Ou; Andrew Zhu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The purpose of the investigation conducted was to discover trends in twitter popularity regarding different areas of science. This investigation can benefit areas of marketing such as targeted advertising, as well as demographic research in order to correctly test certain demographics and obtain research grants. Results included possible confirmation of our motive through principal component analysis, The data was compiled using RStudio and was narrowed down by subjects, Altmetric scores, and countries. The data was parsed through to find Key words in the abstracts of articles. Principal Component Analysis was applied to a matrix of padded tweet dates, arranged by subject. These arranged dates were also plotted to visualize trends over time. From the data collected, the articles that were most tweeted about, between January 1st, 2016 to July 1st, 2016, worldwide were articles concerning physics. Out of all the articles, ”death” was the keyword most popular in articles’ abstracts. Disease-related words appeared far more often than the word ”cure”. The United States of America, Canada and Great Britain had the highest number of tweeters. Great Britain’s population was mainly interested in articles regarding dentistry, while Canada and the United States of America had a higher tweet count in articles related to health science.
twitter-financial-news-sentiment
huggingface.co
opendatalab.com
Updated Dec 4, 2022
+ more versions
Cite
not a (2022). twitter-financial-news-sentiment [Dataset]. https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 4, 2022
Authors
not a
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Description

The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their sentiment.

The dataset holds 11,932 documents annotated with 3 labels:

sentiments = { "LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral" }

The data was collected using the Twitter API. The current dataset supports the multi-class classification… See the full description on the dataset page: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment.
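As a small illustration of the label mapping above, the sketch below translates class ids into sentiment names and tallies them. It assumes labels are stored as the integers 0, 1, and 2 matching the `LABEL_n` keys; the helper names and the example label list are hypothetical and do not come from the dataset.

```python
from collections import Counter

# Label mapping copied from the dataset description.
SENTIMENTS = {"LABEL_0": "Bearish", "LABEL_1": "Bullish", "LABEL_2": "Neutral"}

def label_name(label_id):
    """Translate an integer class id (assumed 0, 1, or 2) into its sentiment name."""
    return SENTIMENTS[f"LABEL_{label_id}"]

def label_distribution(label_ids):
    """Count how often each sentiment name occurs in a list of class ids."""
    return Counter(label_name(i) for i in label_ids)

# Hypothetical labels, not taken from the dataset.
example_labels = [0, 2, 2, 1, 2]
distribution = label_distribution(example_labels)
print(distribution)
```

Checking the label distribution this way is a common first step before training a multi-class classifier, since an imbalanced split may call for weighting or resampling.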

Famous Words Twitter Dataset

Twitter Dataset: Keywords, Likes, and Tweets

The dataset contains 128,837 tweets in total. The plot below shows the number of tweets for each keyword.

https://i.imgur.com/z4xbbyt.png

Note: The dataset is carefully curated, anonymized, and stripped of any personally identifiable information to protect user privacy.
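A per-keyword tweet count like the one in the plot could be computed along the following lines. This is a sketch under assumptions: the header names `keyword`, `likes`, and `tweet` are inferred from the three columns named in the dataset description, and the sample rows and the function name are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file's header names may differ.
SAMPLE_CSV = """keyword,likes,tweet
Bitcoin,12,Example tweet text one
Bitcoin,3,Example tweet text two
Tesla,40,Example tweet text three
"""

def summarize_by_keyword(csv_text):
    """Return {keyword: (tweet_count, mean_likes)} for keyword/likes/tweet rows."""
    counts = defaultdict(int)
    like_totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["keyword"]] += 1
        like_totals[row["keyword"]] += int(row["likes"])
    return {k: (counts[k], like_totals[k] / counts[k]) for k in counts}

summary = summarize_by_keyword(SAMPLE_CSV)
for keyword, (n_tweets, mean_likes) in summary.items():
    print(f"{keyword}: {n_tweets} tweets, {mean_likes:.1f} mean likes")
```

For the full file, the same function could be fed the file's text; with roughly 128,837 rows a streaming reader like this keeps memory use modest.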
