100+ datasets found

Global Political tweets
kaggle.com
Updated Aug 23, 2022
Cite
Kash (2022). Global Political tweets [Dataset]. https://www.kaggle.com/kaushiksuresh147/political-tweets/tasks
Dataset updated
Aug 23, 2022
Dataset provided by
Kaggle
Authors
Kash
License
https://creativecommons.org/publicdomain/zero/1.0/
Description

Social media is becoming a key medium through which we communicate with each other: it is at the center of the very structures of our daily interactions. Yet this infiltration is not unique to interpersonal relations. Political leaders, governments, and states operate within this social media environment, wherein they continually address crises and institute damage control through platforms such as Twitter.

With the spread of the internet to the general public, social media is emerging as a major channel of communication. It gives politicians a direct way to communicate, connect, and engage with the public. The power of social media, especially Twitter and Facebook, was demonstrated by its successful use during recent US presidential elections and the uprisings in Arab countries. In India too, with the general election approaching in early 2014, political parties and leaders are trying to harness the power of social media.

Content

The tweets have the #Politics hashtag. The collection started on 24/7/2021, and will be updated on a daily basis.

Information regarding the data

The data consists of over 1 lakh (100,000) records with 13 columns. The features are described below.

| No | Column | Description |
| -- | -- | -- |
| 1 | user_name | The name of the user, as they've defined it. |
| 2 | user_location | The user-defined location for this account's profile. |
| 3 | user_description | The user-defined UTF-8 string describing their account. |
| 4 | user_created | Time and date when the account was created. |
| 5 | user_followers | The number of followers an account currently has. |
| 6 | user_friends | The number of friends an account currently has. |
| 7 | user_favourites | The number of favorites an account currently has. |
| 8 | user_verified | When true, indicates that the user has a verified account. |
| 9 | date | UTC time and date when the Tweet was created. |
| 10 | text | The actual UTF-8 text of the Tweet. |
| 11 | hashtags | All the other hashtags posted in the tweet along with #Politics. |
| 12 | source | Utility used to post the Tweet. Tweets from the Twitter website have a source value of web. |
| 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |

Inspiration

You can use this data to dive into the subjects that use this hashtag, look to the geographical distribution, evaluate sentiments, and look at trends.
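As a rough illustration of the trend analysis suggested above, the sketch below counts tweets per day using only the `date` column. The inline sample rows and the `%Y-%m-%d %H:%M:%S` timestamp format are assumptions made for the example; the real file may format dates differently.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows using a subset of the 13 documented columns.
SAMPLE = """user_name,user_location,user_verified,date,text,is_retweet
alice,Delhi,True,2021-07-24 10:15:00,Budget debate today #Politics,False
bob,,False,2021-07-24 18:40:00,Election news #Politics,False
carol,London,False,2021-07-25 09:05:00,New poll out #Politics,False
"""


def tweets_per_day(csv_file):
    """Count rows per calendar day (UTC), keyed by ISO date string."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Assumed timestamp format; adjust if the real data differs.
        created = datetime.strptime(row["date"], "%Y-%m-%d %H:%M:%S")
        counts[created.date().isoformat()] += 1
    return dict(counts)


daily = tweets_per_day(io.StringIO(SAMPLE))
print(daily)
```

To run this on a downloaded file, pass an open file handle instead of the `io.StringIO` sample.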
The Twitter Parliamentarian Database
figshare.com
txt
Updated Oct 27, 2023
Cite
Livia van Vliet (2023). The Twitter Parliamentarian Database [Dataset]. http://doi.org/10.6084/m9.figshare.10120685.v3
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.10120685.v3
Dataset updated
Oct 27, 2023
Dataset provided by
figshare
Authors
Livia van Vliet
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This is the Twitter Parliamentarian Database: a database of parliamentarian names, parties and Twitter ids from the following countries: Austria, Belgium, France, Denmark, Spain, Finland, Germany, Greece, Italy, Malta, Poland, Netherlands, United Kingdom, Ireland, Sweden, New Zealand, Turkey, United States, Canada, Australia, Iceland, Norway, Switzerland, Luxembourg, Latvia and Slovenia. In addition, the database includes the European Parliament.

The tweet ids from the politicians' tweets were collected from September 2017 to 31 October 2019 (all_tweet_ids.csv). In compliance with Twitter's policy, we only store tweet ids, which can be re-hydrated into full tweets using existing tools. More information on how to use the database can be found in the readme.txt.

It is recommended that you use the .csv files to work with the data, rather than the SQL tables. Information on the relations in the SQL database can be found in the Database codebook.pdf.

Update: The tweet ids for 2021 have been added as '2021.csv'.

Update #2: The tweet ids for 2020 have been added as '2020.csv'. The last party table has been added as 'parties_2021_04_28.csv'. The last members table has been added as 'members_2021_04_28.csv'.
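Because only tweet ids are stored, a typical first step before re-hydration is to split the ids into request-sized batches. The sketch below shows that step only. The batch size of 100 and the synthetic ids are assumptions for illustration; the actual limit depends on the re-hydration tool or API you use.

```python
def batched(ids, size=100):
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Synthetic ids standing in for a column read from all_tweet_ids.csv
# (the real column layout is not described here).
tweet_ids = [str(1_000_000 + i) for i in range(250)]
batches = list(batched(tweet_ids))
print(len(batches), len(batches[-1]))
```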
Public Political Opinion on Twitter
opendatabay.com
Updated Jul 5, 2025
Cite
Datasimple (2025). Public Political Opinion on Twitter [Dataset]. https://www.opendatabay.com/data/ai-ml/c8d2d199-5c65-401a-8d9d-c88bd5471489
Dataset updated
Jul 5, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Social Media and Networking
Description
This dataset captures a vast collection of social media discourse related to global politics. It highlights how social media has become a crucial medium for communication, especially for political leaders, governments, and states engaging with the public. The data illustrates the role of platforms like Twitter in addressing crises and managing public perception. The power of social media, particularly Twitter, has been demonstrated in significant political events such as recent US presidential elections and revolts in Arabian countries. The collection focuses on tweets containing the #Politics hashtag, with daily updates ensuring relevance and recency.

Columns

user_name: The name defined by the user for their account.

user_location: The user-defined location associated with the account's profile.

user_description: The user-defined text string that describes their account.

user_created: The date and time when the user's account was created.

user_followers: The current number of followers an account has.

user_friends: The current number of accounts a user is following.

user_favourites: The number of tweets an account has marked as favourites.

user_verified: A boolean indicator; 'true' if the user has a verified account.

date: The UTC date and time when the Tweet was created.

text: The actual UTF-8 text content of the Tweet.

hashtags: All other hashtags included in the tweet, in addition to #Politics.

source: The utility or platform used to post the Tweet (e.g., 'web' for tweets from the Twitter website).

is_retweet: Indicates whether the Tweet has been Retweeted by the authenticating user.

Distribution

The dataset is typically provided in CSV format. It contains over 100,000 records; the listing reports 238,527 unique values within the observed date ranges. Each record has 13 columns describing the political tweets and their authors. Per-period record counts, available from the detailed time series, range from hundreds to tens of thousands.

Usage

This dataset is ideal for exploratory data analysis, allowing users to dive into the subjects associated with the #Politics hashtag. It can be used to analyse geographical distribution of political discourse, evaluate sentiments expressed in tweets, and identify emerging trends in political conversations on social media. Researchers and analysts can gain insights into public opinion, political communication strategies, and the impact of social media on political landscapes.

Coverage

The dataset's geographic scope is global. The current collection began on 24th July 2021 and is updated on a daily basis. Data spans from this start date up to 21st August 2022 based on current observations. Some historical aggregations of #Politics tweets are available for periods as early as 14th July 2006. No specific notes on data availability for certain demographic groups are provided, but the user and tweet metadata allows for some inferred demographic analysis.

License

CC0

Who Can Use It

This dataset is valuable for a wide range of users, including:

* Political scientists and researchers studying political communication and social media's impact on public discourse.
* Data scientists and analysts keen on performing social media analysis, including sentiment analysis and topic modelling.
* Natural Language Processing (NLP) practitioners developing models for text classification, entity recognition, or language understanding in a political context.
* Organisations and individuals interested in monitoring political trends and public engagement on social media platforms.

Dataset Name Suggestions

Global Political Tweets

Political Twitter Discourse

#Politics Hashtag Tweets

Social Media Political Activity

Public Political Opinion on Twitter

Attributes

Original Data Source: Global Political tweets
twitter politicians data Dataset
paperswithcode.com
opendatalab.com
Updated May 13, 2020
Cite
Padmaja Jonnalagedda; Brent Weinberg; Jason Allen; Taejin L. Min; Shiv Bhanu; Bir Bhanu (2020). twitter politicians data Dataset [Dataset]. https://paperswithcode.com/dataset/twitter-politicians-data
Dataset updated
May 13, 2020
Authors
Padmaja Jonnalagedda; Brent Weinberg; Jason Allen; Taejin L. Min; Shiv Bhanu; Bir Bhanu
Description
Dataset based on Twitter usernames of American politicians. Data extracted from Wikidata.

The same politician can appear several times: if he has different pseudonyms on Twitter or Instagram, if he has been in several parties, or if several Twitter account IDs are associated with him. But the data is sorted in ascending order by name, so it is visible
Data from: Supersharers of fake news on Twitter
dataone.org
data.niaid.nih.gov
Updated May 25, 2024
Cite
Sahar Baribi-Bartov; Briony Swire-Thompson; Nir Grinberg (2024). Supersharers of fake news on Twitter [Dataset]. http://doi.org/10.5061/dryad.44j0zpcmq
Unique identifier
https://doi.org/10.5061/dryad.44j0zpcmq
Dataset updated
May 25, 2024
Dataset provided by
Dryad Digital Repository
Authors
Sahar Baribi-Bartov; Briony Swire-Thompson; Nir Grinberg
Time period covered
Jan 1, 2024
Description
Governments may have the capacity to flood social media with fake news, but little is known about the use of flooding by ordinary voters. In this work, we identify 2107 registered US voters that account for 80% of the fake news shared on Twitter during the 2020 US presidential election by an entire panel of 664,391 voters. We find that supersharers are important members of the network, reaching a sizable 5.2% of registered voters on the platform. Supersharers have a significant overrepresentation of women, older adults, and registered Republicans. Supersharers' massive volume does not seem automated but is rather generated through manual and persistent retweeting. These findings highlight a vulnerability of social media for democracy, where a small group of people distort the political reality for many.

This dataset contains aggregated information necessary to replicate the results reported in our work on Supersharers of Fake News on Twitter while respecting and preserving the privacy expectations of individuals included in the analysis. No individual-level data is provided as part of this dataset.

The data collection process that enabled the creation of this dataset leveraged a large-scale panel of registered U.S. voters matched to Twitter accounts. We examined the activity of 664,391 panel members who were active on Twitter during the months of the 2020 U.S. presidential election (August to November 2020, inclusive), and identified a subset of 2,107 supersharers, which are the most prolific sharers of fake news in the panel that together account for 80% of fake news content shared on the platform. We rely on a source-level definition of fake news, that uses the manually-labeled list of fake news sites by Grinberg et al. 2019 and an updated list based on NewsGuard ratings (commercial...

# Supersharers of Fake News on Twitter

This repository contains data and code for replication of the results presented in the paper.

The folders are mostly organized by research questions as detailed below. Each folder contains the code and publicly available data necessary for the replication of results. Importantly, no individual-level data is provided as part of this repository. De-identified individual-level data can be attained for IRB-approved uses under the terms and conditions specified in the paper. Once access is granted, the restricted-access data is expected to be located under ./restricted_data.

The folders in this repository are the following:

Preprocessing

Code under the preprocessing folder contains the following:

source classifier - the code used to train a classifier based on NewsGuard domain flags to match the source-level fake news label definition used in Grinberg et al. 2019.

political classifier - the code used to identify political tweets, i...
Replication Data for: The presence of problematic information and users on...
search.dataone.org
Updated Nov 22, 2023
Cite
Groen, Maarten (2023). Replication Data for: The presence of problematic information and users on Twitter in the run-up to the 2020 U.S. Elections [Dataset]. http://doi.org/10.7910/DVN/QIJQ3X
Unique identifier
https://doi.org/10.7910/DVN/QIJQ3X
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Groen, Maarten
Description
Tweet IDs from political Twitter, March 2020 This dataset contains the Twitter tweet IDs used in a study investigating the extent to which problematic information is present in the most engaged-with content in political and issue spaces on Twitter in the run-up to the 2020 US elections. These tweets were returned from running in DMI-TCAT a curated list of queries for political candidates, political parties and social issues, incorporating politician-specific, party-specific and issue-specific keywords and hashtags. The shared URLs dataset was collected during a three-week timeframe (March 2-22, 2020, or around Super Tuesday) and contains only the tweet IDs that contain a URL.
Data for right-wing groups found on Twitter around the 2016 US election
royalholloway.figshare.com
txt
Updated Oct 25, 2018
Cite
John Bryden; Eric Silverman (2018). Data for right-wing groups found on Twitter around the 2016 US election [Dataset]. http://doi.org/10.17637/rh.7160027.v2
Available download formats: txt
Unique identifier
https://doi.org/10.17637/rh.7160027.v2
Dataset updated
Oct 25, 2018
Dataset provided by
Royal Holloway, University of London
Authors
John Bryden; Eric Silverman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
File contains a list of Twitter account IDs in ASCII format. These accounts were those which we sampled and then analysed in the paper. The data we used are available from Twitter with the REST API.
Data from: Data used to develop #Polar scores
figshare.com
bin
Updated May 31, 2023
Cite
Culotta, A.; Libby Hemphill; Heston, M. (2023). Data used to develop #Polar scores [Dataset]. http://doi.org/10.6084/m9.figshare.3409927.v3
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.3409927.v3
Dataset updated
May 31, 2023
Dataset provided by
figshare (http://figshare.com/)
Authors
Culotta, A.; Libby Hemphill; Heston, M.
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a new approach to measuring political polarization, including a novel algorithm and open source Python code, which leverages Twitter content to produce measures of polarization for both users and hashtags. #Polar scores provide advantages over existing measures because they (1) can be calculated throughout the legislative cycle, (2) allow for easy differentiation between users with similar scores, (3) are chamber-agnostic, and (4) are a generic approach that can be applied beyond the U.S. Congress. #Polar scores leverage available information such as party labels, word frequency, and hashtags to create an accessible, straightforward algorithm for estimating polarity using text. (from the paper: Hemphill, L., Culotta, A., and Heston, M. (forthcoming) #Polar Scores: Measuring partisanship using social media content. Journal of Information Technology & Politics.)

The dataset contains one plain text TSV file with the following information for each of the 55,244 tweets used to develop #Polar scores: tweet_id, created_at, user_id, screen_name, tag, shortid, sex, party, state, chamber, name. The file contains one row per hashtag, and therefore tweets may appear more than once. The Python code for calculating #Polar scores is available here: http://doi.org/10.5281/zenodo.53888
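Since the file has one row per hashtag, a tweet with several hashtags spans several rows. The sketch below groups rows back into one entry per tweet. It is not the authors' code: the sample values are invented, and it assumes the file has no header row and that the columns appear in the order listed above.

```python
import csv
import io
from collections import defaultdict

# Column order as listed in the description (assumed to match the file).
FIELDS = ["tweet_id", "created_at", "user_id", "screen_name", "tag",
          "shortid", "sex", "party", "state", "chamber", "name"]

# Invented rows for illustration: tweet 1 carries two hashtags.
ROWS = [
    ["1", "2014-01-01", "10", "rep_a", "jobs", "a1", "F", "D", "MI", "house", "Rep A"],
    ["1", "2014-01-01", "10", "rep_a", "economy", "a1", "F", "D", "MI", "house", "Rep A"],
    ["2", "2014-01-02", "11", "sen_b", "vote", "b2", "M", "R", "TX", "senate", "Sen B"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS) + "\n"


def tags_by_tweet(tsv_file):
    """Map each tweet_id to the list of hashtags found across its rows."""
    tags = defaultdict(list)
    reader = csv.DictReader(tsv_file, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        tags[row["tweet_id"]].append(row["tag"])
    return dict(tags)


grouped = tags_by_tweet(io.StringIO(SAMPLE))
print(grouped)
```

Counting the keys of `grouped` gives the number of distinct tweets, which is the figure to compare against the 55,244 tweets mentioned in the description.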
Joe Biden Social Media Activity Dataset
opendatabay.com
Updated Jul 5, 2025
Cite
Datasimple (2025). Joe Biden Social Media Activity Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/9aca91ad-62ea-4fe6-8c06-a5baf8d04e6b
Dataset updated
Jul 5, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Social Media and Networking
Description
This dataset provides a detailed collection of tweets from Joe Biden's official Twitter account, @JoeBiden. It covers a significant period from 24th October, 2007, up to 31st October, 2020. The purpose of this dataset is to offer an invaluable resource for researchers, analysts, and anyone interested in tracking political communication, social media trends, and public sentiment over time. It offers direct insights into his public statements and engagements during his tenure as Vice President and his presidential campaign.

Columns

id: The unique identifier for each tweet on Twitter.

timestamp: The exact time and date the tweet was posted, recorded in Greenwich Mean Time (GMT).

url: The direct URL link to the specific tweet on Twitter.

tweet: The full text content of the tweet.

replies: The number of replies received by the tweet.

retweets: The number of times the tweet was retweeted by other users.

quotes: The number of times the tweet was quoted by other users.

likes: The total number of likes (or 'favourites') the tweet received.

Distribution

The dataset is typically provided in a CSV file format. It comprises 6,064 unique records, each representing a single tweet from Joe Biden's Twitter handle. The data spans from late 2007 to late 2020.
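As one possible starting point with the columns above, the sketch below sums the four count columns into a single per-tweet engagement figure and picks the most engaged tweet. Treating engagement as the plain sum of replies, retweets, quotes and likes is an assumption made here, and the sample rows, URLs and timestamp format are invented for the example.

```python
import csv
import io

# Invented sample rows following the documented column names.
SAMPLE = """id,timestamp,url,tweet,replies,retweets,quotes,likes
1,2020-10-30 14:00,https://example.com/1,First sample tweet,10,20,5,100
2,2020-10-31 09:30,https://example.com/2,Second sample tweet,3,40,2,60
"""

COUNT_COLUMNS = ("replies", "retweets", "quotes", "likes")


def engagement(row):
    """Sum the four interaction counts for one tweet (assumed metric)."""
    return sum(int(row[column]) for column in COUNT_COLUMNS)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
top = max(rows, key=engagement)
print(top["id"], engagement(top))
```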

Usage

This dataset is ideal for various applications and use cases, including:

* Political Analysis: Studying the evolution of Joe Biden's political messaging and public discourse.
* Social Media Trend Analysis: Identifying patterns in social media engagement and public response to political figures.
* Natural Language Processing (NLP): Training models for sentiment analysis, topic modelling, and text classification on political language.
* Historical Research: Providing a digital archive for historians and researchers examining contemporary political communication.
* Journalism: Fact-checking and providing context for news stories related to Joe Biden's past statements.

Coverage

The dataset's coverage is global, reflecting the worldwide accessibility of Twitter. The time range is precisely from 24th October, 2007, to 31st October, 2020. The content specifically focuses on tweets originating from Joe Biden's official Twitter handle, @JoeBiden, without specific demographic targeting or limitations beyond the nature of a public figure's Twitter feed.

License

CC0

Who Can Use It

Academic Researchers: For studies on political science, communication, and digital humanities.

Data Scientists and Analysts: For developing and testing algorithms related to text analysis and social media metrics.

Journalists: For investigative reporting and historical context.

AI and LLM Developers: For creating and refining language models with real-world political discourse.

Students: As a practical resource for projects in data analysis, computer science, and political studies.

Dataset Name Suggestions

Joe Biden Tweets Archive

Joe Biden's Official Twitter Data (2007-2020)

US Political Tweets: Joe Biden

Biden's Digital Footprint on Twitter

Joe Biden Social Media Activity Dataset

Attributes

Original Data Source: Joe Biden Tweets (2007 - 2020)
Mexican Political Twitter Dataset (2018 Presidential Election)
data.mendeley.com
Updated Apr 21, 2025
Cite
Carlos Piña (2025). Mexican Political Twitter Dataset (2018 Presidential Election) [Dataset]. http://doi.org/10.17632/j4pxzxpkc3.1
Unique identifier
https://doi.org/10.17632/j4pxzxpkc3.1
Dataset updated
Apr 21, 2025
Authors
Carlos Piña
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Mexico
Description
This dataset contains Twitter data collected during the 2018 Mexican presidential election campaign, focusing on mentions and tweets related to the main presidential candidates (@JoseAMeadeK, @RicardoAnayaC, @lopezobrador_, and @JaimeRdzNL). It represents a sample of 10,000 tweets from a larger dataset gathered as part of the research project "In-Context Learning for Misinformation Detection: Detecting Political Propaganda on Twitter Mexico using Large Language Model Meta AI".

The dataset includes the following fields:

* tweet_id: Unique identifier for each tweet
* followers_count: Number of followers of the user who posted the tweet
* created_at: Original timestamp of tweet creation (UTC)
* local_time: Timestamp converted to Mexico City time zone
* tweet: Text content of the tweet
* source: Platform or application used to post the tweet
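The relationship between `created_at` and `local_time` can be sketched as a UTC to Mexico City conversion. This is an illustration rather than the author's script: the `%Y-%m-%d %H:%M:%S` input format is an assumption, and the fixed UTC-5 fallback is only valid for the mid-2018 period when Mexico City observed daylight saving time.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    MEXICO_CITY = ZoneInfo("America/Mexico_City")
except ZoneInfoNotFoundError:
    # No time zone database installed: fall back to a fixed UTC-5 offset,
    # which matches Mexico City daylight time in mid-2018 only.
    MEXICO_CITY = timezone(timedelta(hours=-5))


def to_local_time(created_at_utc):
    """Convert a UTC timestamp string to an aware Mexico City datetime."""
    utc_dt = datetime.strptime(created_at_utc, "%Y-%m-%d %H:%M:%S")
    return utc_dt.replace(tzinfo=timezone.utc).astimezone(MEXICO_CITY)


local = to_local_time("2018-07-02 01:30:00")
print(local.isoformat())
```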

This sample dataset was collected using Twitter's streaming API in 2018. The script filtered the global Twitter stream for mentions of Mexico's presidential candidates.

Several fields present in the original data collection have been removed from this sample to comply with Twitter's terms of service and to protect user privacy:

* Username (screen_name)
* Tweet URL
* Geographical coordinates
* User location information

Only publicly accessible tweets (those without privacy restrictions set by users) were collected in the original dataset.

This dataset serves as a sample to provide insights into the larger research project focusing on misinformation detection and political propaganda analysis in Mexican social media during the 2018 presidential campaign. The research applies large language models to detect patterns of misinformation in political discourse.
Twitter activity data for Twitter-based analysis of the dynamics of...
figshare.com
application/gzip
Updated Jun 1, 2023
Cite
Young-ho Eom (2023). Twitter activity data for Twitter-based analysis of the dynamics of collective attention to political parties (PLoS ONE, 2015) [Dataset]. http://doi.org/10.6084/m9.figshare.1437740.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.1437740.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Young-ho Eom
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the dynamics of Twitter activity on political parties. Please see the main paper "Twitter-based analysis of the dynamics of collective attention to political parties (PLoS ONE, 2015)" for the details.
Replication Data for: Violent Political Rhetoric on Twitter
search.dataone.org
Updated Nov 12, 2023
Cite
Kim, Tae Gyoon (2023). Replication Data for: Violent Political Rhetoric on Twitter [Dataset]. http://doi.org/10.7910/DVN/NEC17Z
Unique identifier
https://doi.org/10.7910/DVN/NEC17Z
Dataset updated
Nov 12, 2023
Dataset provided by
Harvard Dataverse
Authors
Kim, Tae Gyoon
Description
copy directly from abstract in PSRM publication. Visit https://dataone.org/datasets/sha256%3Ab55eab8750f6f4b758b0db7bbe0297a5da783ad5f9d3d6c11e4670ee355d4ce7 for complete metadata about this dataset.
Data for right-wing groups found on Twitter around the 2016 US election
figshare.com
txt
Updated Oct 5, 2018
Cite
John Bryden; Eric Silverman (2018). Data for right-wing groups found on Twitter around the 2016 US election [Dataset]. http://doi.org/10.17637/rh.7160027.v1
Available download formats: txt
Unique identifier
https://doi.org/10.17637/rh.7160027.v1
Dataset updated
Oct 5, 2018
Dataset provided by
Royal Holloway, University of London
Authors
John Bryden; Eric Silverman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Anonymised raw data downloaded from Twitter. All twitter ids have been encrypted so will not work.

The .tgz files each contain a .json file containing user account data. In the files there is a json line for each account, which includes the id, its creation date, a list of ids of accounts followed (followids), and a list of ids for accounts which follow that account (friendids).

The file groupMembers.json has a json line for each group of accounts found. This includes the id for the group, and a list of the ids of its members.

The file groupDescriptions.json has a json line for each group of accounts found. This includes the id for the group and a list of the unusual words found for each group.
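The one-JSON-object-per-line layout described above can be read with a few lines of standard-library Python. In the sketch below only `id`, `followids` and `friendids` come from the description; the `created` and `members` keys and all sample values are hypothetical stand-ins, so check the real files for the actual key names.

```python
import json

# Invented account lines; "created" is a hypothetical key name.
ACCOUNT_LINES = [
    '{"id": "a1", "created": "2015-03-01", "followids": ["a2"], "friendids": ["a2", "a3"]}',
    '{"id": "a2", "created": "2016-07-15", "followids": ["a1", "a3"], "friendids": ["a1"]}',
]

# Invented group lines; "members" is a hypothetical key name.
GROUP_LINES = [
    '{"id": "g1", "members": ["a1", "a2"]}',
    '{"id": "g2", "members": ["a3"]}',
]


def parse_json_lines(lines):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def group_of_each_account(groups):
    """Invert the group records into an account id -> group id lookup."""
    return {member: group["id"] for group in groups for member in group["members"]}


accounts = parse_json_lines(ACCOUNT_LINES)
membership = group_of_each_account(parse_json_lines(GROUP_LINES))
print(len(accounts), membership)
```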
[Tweets] 2022 Brazilian Presidential Elections
zenodo.org
zip
Updated Feb 7, 2025
Cite
Lucas Raniére Juvino Santos; Leandro Balby Marinho; Claudio Campelo (2025). [Tweets] 2022 Brazilian Presidential Elections [Dataset]. http://doi.org/10.5281/zenodo.14834669
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.14834669
Dataset updated
Feb 7, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lucas Raniére Juvino Santos; Leandro Balby Marinho; Claudio Campelo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Aug 1, 2022
Area covered
Brazil
Description
2022 Brazilian Presidential Election

This dataset contains 7,015,186 tweets from 951,602 users, extracted using 91 search terms over 36 days between August 1st and December 31st, 2022.

All tweets in this dataset are in Brazilian Portuguese.

Data Usage

The dataset contains textual data from tweets, making it suitable for various NLP analyses, such as sentiment analysis, bias or stance detection, and toxic language detection. Additionally, users and tweets can be linked to create social graphs, enabling Social Network Analysis (SNA) to study polarization, communities, and other social dynamics.

Extraction Method

This data set was extracted using Twitter's (now X) official API—when Academic Research API access was still available—following the pipeline:

1. Twitter/X daily monitoring: The dataset author monitored daily political events appearing in Brazil's Trending Topics. Twitter/X has an automated system for classifying trending terms. When a term was identified as political, it was stored along with its date for later use as a search query.

2. Tweet collection using saved search terms: Once terms and their corresponding dates were recorded, tweets were extracted from 12:00 AM to 11:59 PM on the day the term entered the Trending Topics. A language filter was applied to select only tweets in Portuguese. The extraction was performed using the official Twitter/X API.

3. Data storage: The extracted data was organized by day and search term. If the same search term appeared in Trending Topics on consecutive days, a separate file was stored for each respective day.
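Steps 2 and 3 of the pipeline can be sketched as two small helpers: one that builds the 12:00 AM to 11:59 PM window for the day a term trended, and one that derives a per-day, per-term storage name. The file naming scheme, the search term and the dates below are hypothetical, and the description does not say which time zone the window uses, so the datetimes here are naive.

```python
from datetime import date, datetime, time


def day_window(day):
    """Return (start, end) datetimes covering 12:00 AM to 11:59 PM on `day`."""
    return datetime.combine(day, time(0, 0)), datetime.combine(day, time(23, 59))


def storage_key(day, term):
    """Build a hypothetical file name: one file per day and search term."""
    return f"{day.isoformat()}/{term}.json"


# Hypothetical term that trended on two consecutive days,
# so it yields two separate files.
trending = [(date(2022, 10, 2), "eleicoes"), (date(2022, 10, 3), "eleicoes")]
keys = [storage_key(day, term) for day, term in trending]
start, end = day_window(trending[0][0])
print(keys, start, end)
```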

Further Information

For more details, visit:

- The repository
- Dataset short paper:

---

DOI: 10.5281/zenodo.14834669
Data for : Reconstruction of the socio-semantic dynamics of political...
search.dataone.org
dataverse.harvard.edu
Updated Nov 22, 2023
Cite
Gaumont, Noé; Panahi, Maziyar; Chavalarias, David (2023). Data for : Reconstruction of the socio-semantic dynamics of political activist Twitter networks - Method and application to the 2017 French presidential election [Dataset]. http://doi.org/10.7910/DVN/AOGUIA
Unique identifier
https://doi.org/10.7910/DVN/AOGUIA
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Gaumont, Noé; Panahi, Maziyar; Chavalarias, David
Time period covered
Jul 1, 2016 - May 8, 2017
Description
These are the data related to the PLOS ONE paper: Gaumont N, Panahi M, Chavalarias D (2018) Reconstruction of the socio-semantic dynamics of political activist Twitter networks - Method and application to the 2017 French presidential election. PLoS ONE 13(9): e0201879. https://doi.org/10.1371/journal.pone.0201879

This paper proposes an integrated methodology for the data collection, the reconstruction and the visualization of the development of a country's political environment from Twitter data. These data cover several aspects of the analysis of the 2017 French presidential election campaign from the perspective of Twitter: intermediary results computed from the tweets dataset (for example text-mining results) and additional data from the candidates' programs. Additional information is given in the Supporting information texts.
Political Tweets Dataset
brightdata.com
.json, .csv, .xlsx
Updated Dec 23, 2024
Cite
Bright Data (2025). Political Tweets Dataset [Dataset]. https://brightdata.com/products/datasets/twitter/tweets/political
Available download formats: .json, .csv, .xlsx
Dataset updated
Dec 23, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our Political Tweets dataset to enhance campaign strategies and gain insights into public discourse. This dataset offers a comprehensive view of political dynamics on social media, empowering organizations, researchers, and policymakers to analyze trends and sentiment. Access the full dataset or customize it with specific data points tailored to your needs. Popular use cases include:

* Sentiment Analysis: Analyze publicly available political tweets to understand public sentiment on policies, events, and candidates, aiding campaign strategies and opinion research.
* Trend Monitoring: Track trending topics and hashtags in political discourse to identify key issues and shifts in public priorities across demographics.
* Misinformation Detection: Detect and analyze patterns of misinformation, supporting efforts to combat its spread effectively.

Harness these insights to stay informed and adapt to the evolving political landscape.
Replication Data for: "Most users do not engage with political elites on...
search.dataone.org
Updated Nov 8, 2023
Cite
Casas, Andreu (2023). Replication Data for: "Most users do not engage with political elites on Twitter; Those who do, show overwhelming preferences for ideological congruity" [Dataset]. http://doi.org/10.7910/DVN/G5APVR
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/G5APVR
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Casas, Andreu
Description
This repository contains the replication material for the article "Most users do not engage with political elites on Twitter; Those who do, show overwhelming preferences for ideological congruity", to be published in Science Advances, by Magdalena Wojcieszak, Andreu Casas, Xudong Yu, Jonathan Nagler, and Joshua Tucker. One of the datasets (ingroup-sharing-model-data.csv) is too large for Dataverse; a copy is available in this Google Drive folder: https://drive.google.com/drive/folders/1EYqaSF-EukTGhanogqaevSnjn7l0koEH?usp=sharing. You can also clone and use this code/data via GitHub: https://github.com/CasAndreu/ingroup_filtering
Twitter data on political debates about the Italian immigration policies
dataverse.harvard.edu
pdf, tsv
Updated Jun 8, 2020
Cite
Harvard Dataverse (2020). Twitter data on political debates about the Italian immigration policies [Dataset]. http://doi.org/10.7910/DVN/CUWN54
Explore at:
tsv (115906251), pdf (20581) (available download formats)
Unique identifier
https://doi.org/10.7910/DVN/CUWN54
Dataset updated
Jun 8, 2020
Dataset provided by
Harvard Dataverse
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Statistics indicate that Twitter is the social network preferred by journalists in Europe, which makes it a well-suited environment for studying debates about society, economics, and politics. The dataset captures the retweeting activity in an Italian debate on immigration policies over a period of one month (from January 23, 2019 to February 22, 2019). It is labeled according to the boticity score of the users participating in the discussion, as output by Botometer, a popular bot detector. All accounts have been classified as either human-operated or bots. Due to Twitter's developer terms, we can only provide IDs for users and tweets, which can be used to retrieve the original data through the Twitter API. For additional details, please refer to "Twitter data on political debates about the Italian immigration policies" (currently under submission to CIKM 2020 resource papers).
Replication Data : Twitter and the Projection of Political Personalities in...
search.dataone.org
Updated Sep 24, 2024
Cite
Jumle, Vihang; Karthik KR, Vignesh; Jumle, Vedant (2024). Replication Data : Twitter and the Projection of Political Personalities in India [Dataset]. http://doi.org/10.7910/DVN/ATGGJR
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/ATGGJR
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Jumle, Vihang; Karthik KR, Vignesh; Jumle, Vedant
Description
Data with code. Visit https://dataone.org/datasets/sha256%3A6f480cbcc96c839ab0850f72b2233b504d64bdb0537e68afb8dec5a564504995 for complete metadata about this dataset.
Twitter data on 2020 U.S. Presidential Election Debates
dataverse.harvard.edu
Updated Oct 10, 2022
Cite
Manuel Pratelli; Fabio Saracco; Marinella Petrocchi; Rocco De Nicola (2022). Twitter data on 2020 U.S. Presidential Election Debates [Dataset]. http://doi.org/10.7910/DVN/ANBPTC
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/ANBPTC
Dataset updated
Oct 10, 2022
Dataset provided by
Harvard Dataverse
Authors
Manuel Pratelli; Fabio Saracco; Marinella Petrocchi; Rocco De Nicola
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
United States
Description
The dataset contains Twitter traffic related to the 2020 US pre-election debate. The data was collected considering two types of states, namely swing and safe. The term 'swing' refers to those states in which one cannot be certain of a landslide victory for either Republicans or Democrats, as there is no clear historical orientation of the electorate. In contrast, a state is defined as 'safe' if its citizens have traditionally elected representatives of the same political party. In particular, the tweets are from four swing states (Arizona, Florida, Michigan, and Pennsylvania) and four safe states (New Jersey, Indiana, Washington, and Louisiana). The dataset was used in the study "Swinging in the States: Does disinformation on Twitter mirror the US presidential election system?" (under submission).

Global Political tweets

Tweets across the globe with trending #Politics hashtag

Inspiration

You can use this data to explore the subjects discussed under this hashtag, look at the geographical distribution of tweets, evaluate sentiment, and track trends.
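
As a rough illustration of the first, second, and last of these ideas, here is a minimal Python sketch that counts co-occurring hashtags, user locations, and tweets per day. Only the column names (`user_location`, `date`, `hashtags`) come from the dataset description. The sample rows, the list-literal encoding of `hashtags`, and the `YYYY-MM-DD` date prefix are assumptions made for the example and should be checked against the actual file. Sentiment analysis is left out because it would need an external model.

```python
import ast
import csv
import io
from collections import Counter

# Hypothetical sample rows; only the column names come from the dataset description.
SAMPLE_CSV = '''user_location,date,hashtags
"Paris, France",2021-07-24 10:15:00,"['Politics', 'Elections']"
"Delhi, India",2021-07-24 18:40:00,"['Politics']"
"Paris, France",2021-07-25 09:05:00,"['Politics', 'Climate', 'Elections']"
,2021-07-25 12:00:00,
'''


def summarize(rows):
    """Count co-occurring hashtags, user locations, and tweets per day."""
    hashtags, locations, per_day = Counter(), Counter(), Counter()
    for row in rows:
        # Assumption: hashtags are stored as a Python-style list literal.
        raw = (row.get("hashtags") or "").strip()
        if raw:
            for tag in ast.literal_eval(raw):
                if tag.lower() != "politics":  # skip the collection hashtag itself
                    hashtags[tag.lower()] += 1
        # user_location is free text defined by the user, so it may be empty.
        location = (row.get("user_location") or "").strip()
        if location:
            locations[location] += 1
        # Assumption: dates start with an ISO-style YYYY-MM-DD prefix.
        day = (row.get("date") or "")[:10]
        if day:
            per_day[day] += 1
    return hashtags, locations, per_day


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hashtags, locations, per_day = summarize(rows)
print(hashtags.most_common(3))
print(locations.most_common(3))
print(sorted(per_day.items()))
```

To run this on the real file, replace the in-memory sample with an opened CSV file handle. Because `user_location` is user-defined free text, it would likely need normalization before any serious geographical analysis.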
