100+ datasets found
  1. Global Political tweets

    • kaggle.com
    Updated Aug 23, 2022
    Cite
    Kash (2022). Global Political tweets [Dataset]. https://www.kaggle.com/kaushiksuresh147/political-tweets/tasks
    Dataset updated
    Aug 23, 2022
    Dataset provided by
    Kaggle
    Authors
    Kash
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    • Social media is becoming a key medium through which we communicate with each other: it is at the center of the very structures of our daily interactions. Yet this infiltration is not unique to interpersonal relations. Political leaders, governments, and states operate within this social media environment, wherein they continually address crises and institute damage control through platforms such as Twitter.

    • As internet access spreads to the general population, social media is emerging as a major channel of communication. It gives politicians a direct channel for communicating, connecting, and engaging with the public. The power of social media, especially Twitter and Facebook, was demonstrated by its successful use during recent US presidential elections and the uprisings in Arab countries. In India too, with the general election approaching in early 2014, political parties and leaders are trying to harness the power of social media.

    Content

    The tweets all carry the #Politics hashtag. Collection started on 24 July 2021, and the dataset is updated daily.

    Information regarding the data

    The data consists of over 1 lakh (100,000) records with 13 columns. The features are described below.

    | No | Column | Description |
    | -- | -- | -- |
    | 1 | user_name | The name of the user, as they've defined it. |
    | 2 | user_location | The user-defined location for this account's profile. |
    | 3 | user_description | The user-defined UTF-8 string describing their account. |
    | 4 | user_created | Time and date when the account was created. |
    | 5 | user_followers | The number of followers an account currently has. |
    | 6 | user_friends | The number of friends an account currently has. |
    | 7 | user_favourites | The number of favorites an account currently has. |
    | 8 | user_verified | When true, indicates that the user has a verified account. |
    | 9 | date | UTC time and date when the Tweet was created. |
    | 10 | text | The actual UTF-8 text of the Tweet. |
    | 11 | hashtags | All the other hashtags posted in the tweet along with #Politics. |
    | 12 | source | Utility used to post the Tweet. Tweets from the Twitter website have a source value of web. |
    | 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |
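
    As a rough illustration of working with the 13 columns listed above, the sketch below reads records with Python's standard csv module and converts a few fields to native types. The column names come from the description; the sample row, the timestamp format, and the helper names parse_row and load_tweets are assumptions made for this example and are not taken from the dataset itself.

```python
import csv
import io
from datetime import datetime

# Column names as listed in the dataset description.
COLUMNS = [
    "user_name", "user_location", "user_description", "user_created",
    "user_followers", "user_friends", "user_favourites", "user_verified",
    "date", "text", "hashtags", "source", "is_retweet",
]

# Invented sample data; the real file's timestamp format may differ.
SAMPLE_CSV = (
    ",".join(COLUMNS) + "\n"
    + 'Example User,Somewhere,"Invented bio",2015-01-02 03:04:05,'
    + '120,80,15,False,2021-07-24 10:00:00,"Sample text #Politics",'
    + "\"['Politics']\",Twitter Web App,False\n"
)


def parse_row(row):
    """Convert the string fields of one record into Python types."""
    parsed = dict(row)
    for key in ("user_followers", "user_friends", "user_favourites"):
        parsed[key] = int(float(row[key]))
    for key in ("user_verified", "is_retweet"):
        parsed[key] = row[key].strip().lower() == "true"
    # Assumed format: "YYYY-MM-DD HH:MM:SS" in UTC.
    parsed["date"] = datetime.strptime(row["date"], "%Y-%m-%d %H:%M:%S")
    return parsed


def load_tweets(handle):
    """Read all records, checking that the expected columns are present."""
    reader = csv.DictReader(handle)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [parse_row(row) for row in reader]


tweets = load_tweets(io.StringIO(SAMPLE_CSV))
```

    With a real download, the io.StringIO wrapper would be replaced by an open file handle, and the timestamp format should be checked against the actual data first.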

    Inspiration

    You can use this data to dive into the subjects that use this hashtag, look to the geographical distribution, evaluate sentiments, and look at trends.

  2. The Twitter Parliamentarian Database

    • figshare.com
    txt
    Updated Oct 27, 2023
    Cite
    Livia van Vliet (2023). The Twitter Parliamentarian Database [Dataset]. http://doi.org/10.6084/m9.figshare.10120685.v3
    Available download formats: txt
    Dataset updated
    Oct 27, 2023
    Dataset provided by
    figshare
    Authors
    Livia van Vliet
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This is the Twitter Parliamentarian Database: a database consisting of parliamentarian names, parties and Twitter ids from the following countries: Austria, Belgium, France, Denmark, Spain, Finland, Germany, Greece, Italy, Malta, Poland, Netherlands, United Kingdom, Ireland, Sweden, New Zealand, Turkey, United States, Canada, Australia, Iceland, Norway, Switzerland, Luxembourg, Latvia and Slovenia. In addition, the database includes the European Parliament.

    The tweet ids from the politicians' tweets have been collected from September 2017 to 31 October 2019 (all_tweet_ids.csv). In compliance with Twitter's policy, we only store tweet ids, which can be re-hydrated into full tweets using existing tools. More information on how to use the database can be found in the readme.txt.

    It is recommended that you use the .csv files to work with the data, rather than the SQL tables. Information on the relations in the SQL database can be found in the Database codebook.pdf.

    Update: The tweet ids for 2021 have been added as '2021.csv'.

    Update #2: The tweet ids for 2020 have been added as '2020.csv'. The last party table has been added as 'parties_2021_04_28.csv'. The last members table has been added as 'members_2021_04_28.csv'.
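
    Because only tweet ids are stored, a typical first step before re-hydration is to read the ids and group them into request-sized batches. The sketch below shows that preparation step only. It assumes a one-column CSV of numeric ids with an optional header, and a batch size of 100; both are assumptions for illustration and should be checked against the readme.txt and whichever re-hydration tool is used. The helper names are hypothetical.

```python
import csv
import io


def read_tweet_ids(handle):
    """Yield tweet ids from the first CSV column, skipping blanks and headers."""
    for row in csv.reader(handle):
        if row and row[0].strip().isdigit():
            yield row[0].strip()


def batched(ids, size=100):
    """Group ids into lists of at most `size` items (size is an assumption)."""
    batch = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Invented ids standing in for the contents of a file such as all_tweet_ids.csv.
sample = io.StringIO("tweet_id\n" + "\n".join(str(1000 + i) for i in range(250)))
batches = list(batched(read_tweet_ids(sample)))
```

    Each batch can then be handed to a re-hydration tool; ids of deleted or protected tweets will simply not come back.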

  3. Public Political Opinion on Twitter

    • opendatabay.com
    Updated Jul 5, 2025
    Cite
    Datasimple (2025). Public Political Opinion on Twitter [Dataset]. https://www.opendatabay.com/data/ai-ml/c8d2d199-5c65-401a-8d9d-c88bd5471489
    Dataset updated
    Jul 5, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset captures a vast collection of social media discourse related to global politics. It highlights how social media has become a crucial medium for communication, especially for political leaders, governments, and states engaging with the public. The data illustrates the role of platforms like Twitter in addressing crises and managing public perception. The power of social media, particularly Twitter, has been demonstrated in significant political events such as recent US presidential elections and revolts in Arabian countries. The collection focuses on tweets containing the #Politics hashtag, with daily updates ensuring relevance and recency.

    Columns

    • user_name: The name defined by the user for their account.
    • user_location: The user-defined location associated with the account's profile.
    • user_description: The user-defined text string that describes their account.
    • user_created: The date and time when the user's account was created.
    • user_followers: The current number of followers an account has.
    • user_friends: The current number of accounts a user is following.
    • user_favourites: The number of tweets an account has marked as favourites.
    • user_verified: A boolean indicator; 'true' if the user has a verified account.
    • date: The UTC date and time when the Tweet was created.
    • text: The actual UTF-8 text content of the Tweet.
    • hashtags: All other hashtags included in the tweet, in addition to #Politics.
    • source: The utility or platform used to post the Tweet (e.g., 'web' for tweets from the Twitter website).
    • is_retweet: Indicates whether the Tweet has been Retweeted by the authenticating user.

    Distribution

    The dataset is typically provided in CSV format. It contains over 100,000 records; specifically, the collection reports 238,527 unique values within the observed date ranges. Each record has 13 columns describing the tweet and its author. Per-period record counts are available from the detailed time series and range from hundreds to tens of thousands per period.

    Usage

    This dataset is ideal for exploratory data analysis, allowing users to dive into the subjects associated with the #Politics hashtag. It can be used to analyse geographical distribution of political discourse, evaluate sentiments expressed in tweets, and identify emerging trends in political conversations on social media. Researchers and analysts can gain insights into public opinion, political communication strategies, and the impact of social media on political landscapes.

    Coverage

    The dataset's geographic scope is global. The current collection began on 24th July 2021 and is updated on a daily basis. Data spans from this start date up to 21st August 2022 based on current observations. Some historical aggregations of #Politics tweets are available for periods as early as 14th July 2006. No specific notes on data availability for certain demographic groups are provided, but the user and tweet metadata allows for some inferred demographic analysis.

    License

    CC0

    Who Can Use It

    This dataset is valuable for a wide range of users, including:

    • Political scientists and researchers studying political communication and social media's impact on public discourse.
    • Data scientists and analysts keen on performing social media analysis, including sentiment analysis and topic modelling.
    • Natural Language Processing (NLP) practitioners developing models for text classification, entity recognition, or language understanding in a political context.
    • Organisations and individuals interested in monitoring political trends and public engagement on social media platforms.

    Dataset Name Suggestions

    • Global Political Tweets
    • Political Twitter Discourse
    • #Politics Hashtag Tweets
    • Social Media Political Activity
    • Public Political Opinion on Twitter

    Attributes

    Original Data Source: Global Political tweets

  4. twitter politicians data Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated May 13, 2020
    Cite
    Padmaja Jonnalagedda; Brent Weinberg; Jason Allen; Taejin L. Min; Shiv Bhanu; Bir Bhanu (2020). twitter politicians data Dataset [Dataset]. https://paperswithcode.com/dataset/twitter-politicians-data
    Dataset updated
    May 13, 2020
    Authors
    Padmaja Jonnalagedda; Brent Weinberg; Jason Allen; Taejin L. Min; Shiv Bhanu; Bir Bhanu
    Description

    Dataset based on Twitter usernames of American politicians. Data extracted from Wikidata.

    The same politician can appear several times: if they use different pseudonyms on Twitter or Instagram, have belonged to several parties, or have several Twitter account IDs associated with them. Because the data is sorted in ascending order by name, such repeated entries sit on adjacent rows and are easy to spot.
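
    Since the rows are sorted by name, repeated politicians can be collected in a single pass over adjacent rows. The following is a minimal sketch of that idea using itertools.groupby; the field names (name, twitter_username, party) and the sample rows are invented for illustration and may not match the actual file.

```python
from itertools import groupby
from operator import itemgetter

# Invented rows, already sorted by name as the description states.
rows = [
    {"name": "Alex Example", "twitter_username": "alex_ex", "party": "Party A"},
    {"name": "Alex Example", "twitter_username": "alex_official", "party": "Party A"},
    {"name": "Blake Sample", "twitter_username": "bsample", "party": "Party B"},
]


def repeated_politicians(sorted_rows):
    """Return {name: rows} for every name that appears on more than one row.

    groupby only merges adjacent rows, so the input must be sorted by name.
    """
    repeated = {}
    for name, group in groupby(sorted_rows, key=itemgetter("name")):
        entries = list(group)
        if len(entries) > 1:
            repeated[name] = entries
    return repeated


duplicates = repeated_politicians(rows)
```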

  5. Data from: Supersharers of fake news on Twitter

    • dataone.org
    • data.niaid.nih.gov
    • +2 more
    Updated May 25, 2024
    Cite
    Sahar Baribi-Bartov; Briony Swire-Thompson; Nir Grinberg (2024). Supersharers of fake news on Twitter [Dataset]. http://doi.org/10.5061/dryad.44j0zpcmq
    Dataset updated
    May 25, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Sahar Baribi-Bartov; Briony Swire-Thompson; Nir Grinberg
    Time period covered
    Jan 1, 2024
    Description

    Governments may have the capacity to flood social media with fake news, but little is known about the use of flooding by ordinary voters. In this work, we identify 2,107 registered US voters that account for 80% of the fake news shared on Twitter during the 2020 US presidential election by an entire panel of 664,391 voters. We find that supersharers are important members of the network, reaching a sizable 5.2% of registered voters on the platform. Supersharers have a significant overrepresentation of women, older adults, and registered Republicans. Supersharers' massive volume does not seem automated but is rather generated through manual and persistent retweeting. These findings highlight a vulnerability of social media for democracy, where a small group of people distort the political reality for many.

    This dataset contains aggregated information necessary to replicate the results reported in our work on Supersharers of Fake News on Twitter while respecting and preserving the privacy expectations of individuals included in the analysis. No individual-level data is provided as part of this dataset.

    The data collection process that enabled the creation of this dataset leveraged a large-scale panel of registered U.S. voters matched to Twitter accounts. We examined the activity of 664,391 panel members who were active on Twitter during the months of the 2020 U.S. presidential election (August to November 2020, inclusive), and identified a subset of 2,107 supersharers, which are the most prolific sharers of fake news in the panel that together account for 80% of fake news content shared on the platform. We rely on a source-level definition of fake news, that uses the manually-labeled list of fake news sites by Grinberg et al. 2019 and an updated list based on NewsGuard ratings (commercial...

    Supersharers of Fake News on Twitter

    This repository contains data and code for replication of the results presented in the paper.

    The folders are mostly organized by research questions as detailed below. Each folder contains the code and publicly available data necessary for the replication of results. Importantly, no individual-level data is provided as part of this repository. De-identified individual-level data can be attained for IRB-approved uses under the terms and conditions specified in the paper. Once access is granted, the restricted-access data is expected to be located under ./restricted_data.

    The folders in this repository are the following:

    Preprocessing

    Code under the preprocessing folder contains the following:

    1. source classifier - the code used to train a classifier based on NewsGuard domain flags to match the source-level fake news labels used in Grinberg et al. 2019.
    2. political classifier - the code used to identify political tweets, i...
  6. Replication Data for: The presence of problematic information and users on...

    • search.dataone.org
    Updated Nov 22, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Groen, Maarten (2023). Replication Data for: The presence of problematic information and users on Twitter in the run-up to the 2020 U.S. Elections [Dataset]. http://doi.org/10.7910/DVN/QIJQ3X
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Groen, Maarten
    Description

    Tweet IDs from political Twitter, March 2020

    This dataset contains the Twitter tweet IDs used in a study investigating the extent to which problematic information is present in the most engaged-with content in political and issue spaces on Twitter in the run-up to the 2020 US elections. The tweets were returned by running a curated list of queries in DMI-TCAT for political candidates, political parties and social issues, incorporating politician-specific, party-specific and issue-specific keywords and hashtags. The shared URLs dataset was collected during a three-week timeframe (March 2-22, 2020, or around Super Tuesday) and contains only the IDs of tweets that contain a URL.

  7. Data for right-wing groups found on Twitter around the 2016 US election

    • royalholloway.figshare.com
    txt
    Updated Oct 25, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    John Bryden; Eric Silverman (2018). Data for right-wing groups found on Twitter around the 2016 US election [Dataset]. http://doi.org/10.17637/rh.7160027.v2
    Available download formats: txt
    Dataset updated
    Oct 25, 2018
    Dataset provided by
    Royal Holloway, University of London
    Authors
    John Bryden; Eric Silverman
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    File contains a list of Twitter account IDs in ASCII format. These accounts were those which we sampled and then analysed in the paper. The data we used are available from Twitter with the REST API.

  8. Data from: Data used to develop #Polar scores

    • figshare.com
    bin
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Culotta, A.; Libby Hemphill; Heston, M. (2023). Data used to develop #Polar scores [Dataset]. http://doi.org/10.6084/m9.figshare.3409927.v3
    Available download formats: bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Culotta, A.; Libby Hemphill; Heston, M.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We present a new approach to measuring political polarization, including a novel algorithm and open source Python code, which leverages Twitter content to produce measures of polarization for both users and hashtags. #Polar scores provide advantages over existing measures because they (1) can be calculated throughout the legislative cycle, (2) allow for easy differentiation between users with similar scores, (3) are chamber-agnostic, and (4) are a generic approach that can be applied beyond the U.S. Congress. #Polar scores leverage available information such as party labels, word frequency, and hashtags to create an accessible, straightforward algorithm for estimating polarity using text. (From the paper: Hemphill, L., Culotta, A., and Heston, M. (forthcoming) #Polar Scores: Measuring partisanship using social media content. Journal of Information Technology & Politics.)

    The dataset contains one plain text TSV file with the following information for each of the 55,244 tweets used to develop #Polar scores: tweet_id, created_at, user_id, screen_name, tag, shortid, sex, party, state, chamber, name. The file contains one row per hashtag, and therefore tweets may appear more than once. The Python code for calculating #Polar scores is available here: http://doi.org/10.5281/zenodo.53888
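
    Because the file holds one row per hashtag, counting or analysing tweets requires collapsing rows that share a tweet_id. The sketch below shows one way to do that with the standard csv module. The field names are those listed in the description; the presence of a header row, the sample rows, and the helper name hashtags_by_tweet are assumptions for illustration. This is not the #Polar scoring code, which is available at the DOI given above.

```python
import csv
import io
from collections import defaultdict

# Field names as listed in the dataset description.
FIELDS = [
    "tweet_id", "created_at", "user_id", "screen_name", "tag", "shortid",
    "sex", "party", "state", "chamber", "name",
]

# Invented rows: tweet 1 carries two hashtags, so it appears twice.
SAMPLE_TSV = "\n".join(
    "\t".join(row)
    for row in [
        FIELDS,
        ["1", "2014-01-01", "u1", "rep_one", "jobs", "s1", "F", "D", "XX", "house", "Rep One"],
        ["1", "2014-01-01", "u1", "rep_one", "economy", "s1", "F", "D", "XX", "house", "Rep One"],
        ["2", "2014-01-02", "u2", "sen_two", "jobs", "s2", "M", "R", "YY", "senate", "Sen Two"],
    ]
) + "\n"


def hashtags_by_tweet(handle):
    """Collapse the one-row-per-hashtag layout into {tweet_id: [tags]}."""
    tags = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        tags[row["tweet_id"]].append(row["tag"])
    return dict(tags)


tags = hashtags_by_tweet(io.StringIO(SAMPLE_TSV))
unique_tweet_count = len(tags)
```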

  9. Joe Biden Social Media Activity Dataset

    • opendatabay.com
    Updated Jul 5, 2025
    Cite
    Datasimple (2025). Joe Biden Social Media Activity Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/9aca91ad-62ea-4fe6-8c06-a5baf8d04e6b
    Dataset updated
    Jul 5, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset provides a detailed collection of tweets from Joe Biden's official Twitter account, @JoeBiden. It covers a significant period from 24th October, 2007, up to 31st October, 2020. The purpose of this dataset is to offer an invaluable resource for researchers, analysts, and anyone interested in tracking political communication, social media trends, and public sentiment over time. It offers direct insights into his public statements and engagements during his tenure as Vice President and his presidential campaign.

    Columns

    • id: The unique identifier for each tweet on Twitter.
    • timestamp: The exact time and date the tweet was posted, recorded in Greenwich Mean Time (GMT).
    • url: The direct URL link to the specific tweet on Twitter.
    • tweet: The full text content of the tweet.
    • replies: The number of replies received by the tweet.
    • retweets: The number of times the tweet was retweeted by other users.
    • quotes: The number of times the tweet was quoted by other users.
    • likes: The total number of likes (or 'favourites') the tweet received.
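
    To make the column layout above concrete, here is a small sketch that reads records with these eight columns and ranks tweets by a combined engagement count. The column names come from the list above; the sample rows, the timestamp format, and the definition of engagement as the plain sum of replies, retweets, quotes and likes are assumptions chosen for this example.

```python
import csv
import io

# Invented rows using the column names from the description.
SAMPLE_CSV = (
    "id,timestamp,url,tweet,replies,retweets,quotes,likes\n"
    "1,2020-10-01 12:00,https://example.invalid/1,First invented tweet,10,20,5,100\n"
    "2,2020-10-02 12:00,https://example.invalid/2,Second invented tweet,1,2,0,7\n"
)

COUNT_COLUMNS = ("replies", "retweets", "quotes", "likes")


def total_engagement(row):
    """Sum the four interaction counts for one tweet (an assumed metric)."""
    return sum(int(row[column]) for column in COUNT_COLUMNS)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ranked = sorted(rows, key=total_engagement, reverse=True)
```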

    Distribution

    The dataset is typically provided in a CSV file format. It comprises 6,064 unique records, each representing a single tweet from Joe Biden's Twitter handle. The data spans from late 2007 to late 2020.

    Usage

    This dataset is ideal for various applications and use cases, including:

    • Political Analysis: Studying the evolution of Joe Biden's political messaging and public discourse.
    • Social Media Trend Analysis: Identifying patterns in social media engagement and public response to political figures.
    • Natural Language Processing (NLP): Training models for sentiment analysis, topic modelling, and text classification on political language.
    • Historical Research: Providing a digital archive for historians and researchers examining contemporary political communication.
    • Journalism: Fact-checking and providing context for news stories related to Joe Biden's past statements.

    Coverage

    The dataset's coverage is global, reflecting the worldwide accessibility of Twitter. The time range is precisely from 24th October, 2007, to 31st October, 2020. The content specifically focuses on tweets originating from Joe Biden's official Twitter handle, @JoeBiden, without specific demographic targeting or limitations beyond the nature of a public figure's Twitter feed.

    License

    CC0

    Who Can Use It

    • Academic Researchers: For studies on political science, communication, and digital humanities.
    • Data Scientists and Analysts: For developing and testing algorithms related to text analysis and social media metrics.
    • Journalists: For investigative reporting and historical context.
    • AI and LLM Developers: For creating and refining language models with real-world political discourse.
    • Students: As a practical resource for projects in data analysis, computer science, and political studies.

    Dataset Name Suggestions

    • Joe Biden Tweets Archive
    • Joe Biden's Official Twitter Data (2007-2020)
    • US Political Tweets: Joe Biden
    • Biden's Digital Footprint on Twitter
    • Joe Biden Social Media Activity Dataset

    Attributes

    Original Data Source: Joe Biden Tweets (2007 - 2020)

  10. Mexican Political Twitter Dataset (2018 Presidential Election)

    • data.mendeley.com
    Updated Apr 21, 2025
    Cite
    Carlos Piña (2025). Mexican Political Twitter Dataset (2018 Presidential Election) [Dataset]. http://doi.org/10.17632/j4pxzxpkc3.1
    Dataset updated
    Apr 21, 2025
    Authors
    Carlos Piña
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Mexico
    Description

    This dataset contains Twitter data collected during the 2018 Mexican presidential election campaign, focusing on mentions and tweets related to the main presidential candidates (@JoseAMeadeK, @RicardoAnayaC, @lopezobrador_, and @JaimeRdzNL). It represents a sample of 10,000 tweets from a larger dataset gathered as part of the research project "In-Context Learning for Misinformation Detection: Detecting Political Propaganda on Twitter Mexico using Large Language Model Meta AI".

    The dataset includes the following fields:

    • tweet_id: Unique identifier for each tweet
    • followers_count: Number of followers of the user who posted the tweet
    • created_at: Original timestamp of tweet creation (UTC)
    • local_time: Timestamp converted to Mexico City time zone
    • tweet: Text content of the tweet
    • source: Platform or application used to post the tweet

    This sample dataset was collected using Twitter's streaming API in 2018. The script filtered the global Twitter stream for mentions of Mexico's presidential candidates.
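
    The kind of mention filter described above can be approximated offline on the tweet text. The sketch below checks which of the four candidate handles named in the description a tweet mentions, ignoring case. It is an illustration only, not the authors' collection script; the function name and sample texts are invented.

```python
import re

# Candidate handles as given in the dataset description.
CANDIDATE_HANDLES = ("JoseAMeadeK", "RicardoAnayaC", "lopezobrador_", "JaimeRdzNL")

# Matches @ followed by letters, digits or underscores.
MENTION_RE = re.compile(r"@(\w+)")


def mentioned_candidates(text):
    """Return the candidate handles mentioned in `text`, in order of appearance."""
    by_lowercase = {handle.lower(): handle for handle in CANDIDATE_HANDLES}
    found = []
    for mention in MENTION_RE.findall(text):
        handle = by_lowercase.get(mention.lower())
        if handle and handle not in found:
            found.append(handle)
    return found


example = mentioned_candidates("Debate: @lopezobrador_ vs @ricardoanayac tonight")
```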

    Several fields present in the original data collection have been removed from this sample to comply with Twitter's terms of service and to protect user privacy:

    • Username (screen_name)
    • Tweet URL
    • Geographical coordinates
    • User location information

    Only publicly accessible tweets (those without privacy restrictions set by users) were collected in the original dataset.

    This dataset serves as a sample to provide insights into the larger research project focusing on misinformation detection and political propaganda analysis in Mexican social media during the 2018 presidential campaign. The research applies large language models to detect patterns of misinformation in political discourse.

  11. Twitter activity data for Twitter-based analysis of the dynamics of...

    • figshare.com
    application/gzip
    Updated Jun 1, 2023
    Cite
    Young-ho Eom (2023). Twitter activity data for Twitter-based analysis of the dynamics of collective attention to political parties (PLoS ONE, 2015) [Dataset]. http://doi.org/10.6084/m9.figshare.1437740.v1
    Available download formats: application/gzip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Young-ho Eom
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains the dynamics of Twitter activity related to political parties. For details, please see the main paper, "Twitter-based analysis of the dynamics of collective attention to political parties" (PLoS ONE, 2015).

  12. Replication Data for: Violent Political Rhetoric on Twitter

    • search.dataone.org
    Updated Nov 12, 2023
    Cite
    Kim, Tae Gyoon (2023). Replication Data for: Violent Political Rhetoric on Twitter [Dataset]. http://doi.org/10.7910/DVN/NEC17Z
    Dataset updated
    Nov 12, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Kim, Tae Gyoon
    Description

    copy directly from abstract in PSRM publication. Visit https://dataone.org/datasets/sha256%3Ab55eab8750f6f4b758b0db7bbe0297a5da783ad5f9d3d6c11e4670ee355d4ce7 for complete metadata about this dataset.

  13. Data for right-wing groups found on Twitter around the 2016 US election

    • figshare.com
    txt
    Updated Oct 5, 2018
    Cite
    John Bryden; Eric Silverman (2018). Data for right-wing groups found on Twitter around the 2016 US election [Dataset]. http://doi.org/10.17637/rh.7160027.v1
    Available download formats: txt
    Dataset updated
    Oct 5, 2018
    Dataset provided by
    Royal Holloway, University of London
    Authors
    John Bryden; Eric Silverman
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    Anonymised raw data downloaded from Twitter. All Twitter ids have been encrypted, so they will not work.

    The .tgz files each contain a .json file containing user account data. In the files there is a json line for each account, which includes the id, its creation date, a list of ids of accounts followed (followids), and a list of ids for accounts which follow that account (friendids).

    The file groupMembers.json has a json line for each group of accounts found. This includes the id for the group, and a list of the ids of its members.

    The file groupDescriptions.json has a json line for each group of accounts found. This includes the id for the group and a list of the unusual words found for each group.
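
    Files with one JSON object per line can be read record by record rather than parsed as a single document. The sketch below reads account lines and counts the entries in each id list. The keys id, followids and friendids follow the description above; the key name created for the creation date and the sample records are hypothetical, so the real field names should be checked in the files.

```python
import io
import json

# Invented records in the one-JSON-object-per-line layout described above.
# "created" is a hypothetical key name for the creation date.
SAMPLE_LINES = io.StringIO(
    '{"id": "a1", "created": "2012-05-01", "followids": ["b2", "c3"], "friendids": ["b2"]}\n'
    '{"id": "b2", "created": "2014-09-30", "followids": [], "friendids": ["a1", "c3"]}\n'
)


def read_accounts(handle):
    """Parse one JSON object per non-empty line into {account_id: record}."""
    accounts = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        accounts[record["id"]] = record
    return accounts


accounts = read_accounts(SAMPLE_LINES)

# Number of ids in each list, per account: (len(followids), len(friendids)).
list_sizes = {
    account_id: (len(record["followids"]), len(record["friendids"]))
    for account_id, record in accounts.items()
}
```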

  14. [Tweets] 2022 Brazilian Presidential Elections

    • zenodo.org
    zip
    Updated Feb 7, 2025
    Cite
    Lucas Raniére Juvino Santos; Leandro Balby Marinho; Claudio Campelo (2025). [Tweets] 2022 Brazilian Presidential Elections [Dataset]. http://doi.org/10.5281/zenodo.14834669
    Available download formats: zip
    Dataset updated
    Feb 7, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lucas Raniére Juvino Santos; Leandro Balby Marinho; Claudio Campelo
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Aug 1, 2022
    Area covered
    Brazil
    Description

    2022 Brazilian Presidential Election

    This dataset contains 7,015,186 tweets from 951,602 users, extracted using 91 search terms over 36 days between August 1st and December 31st, 2022.

    All tweets in this dataset are in Brazilian Portuguese.

    Data Usage

    The dataset contains textual data from tweets, making it suitable for various NLP analyses, such as sentiment analysis, bias or stance detection, and toxic language detection. Additionally, users and tweets can be linked to create social graphs, enabling Social Network Analysis (SNA) to study polarization, communities, and other social dynamics.

    Extraction Method

    This data set was extracted using Twitter's (now X) official API—when Academic Research API access was still available—following the pipeline:

    1. Twitter/X daily monitoring: The dataset author monitored daily political events appearing in Brazil's Trending Topics. Twitter/X has an automated system for classifying trending terms. When a term was identified as political, it was stored along with its date for later use as a search query.

    2. Tweet collection using saved search terms: Once terms and their corresponding dates were recorded, tweets were extracted from 12:00 AM to 11:59 PM on the day the term entered the Trending Topics. A language filter was applied to select only tweets in Portuguese. The extraction was performed using the official Twitter/X API.

    3. Data storage: The extracted data was organized by day and search term. If the same search term appeared in Trending Topics on consecutive days, a separate file was stored for each respective day.
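
    The three steps above can be sketched as a small planning routine: for each recorded (search term, date) pair, compute the one-day collection window and a separate storage location per day and term. This is only an illustration of the described pipeline, not the authors' code; the directory layout, file extension, function names and sample term are all assumptions, and no API call is made.

```python
from datetime import date, datetime, time
from pathlib import PurePosixPath


def day_window(day):
    """Return the 12:00 AM to 11:59 PM window for the given date."""
    return datetime.combine(day, time(0, 0)), datetime.combine(day, time(23, 59))


def storage_path(root, day, term):
    """Build a per-day, per-term file path (hypothetical layout)."""
    safe_term = "".join(ch if ch.isalnum() else "_" for ch in term)
    return PurePosixPath(root) / day.isoformat() / f"{safe_term}.jsonl"


# Invented example: the same term trending on two consecutive days
# yields two separate files, one per day.
trending = [
    ("termo exemplo", date(2022, 8, 1)),
    ("termo exemplo", date(2022, 8, 2)),
]
plan = [(day_window(day), storage_path("tweets", day, term)) for term, day in trending]
```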

    Further Information

    For more details, visit:

    - The repository
    - Dataset short paper:

    ---

    DOI: 10.5281/zenodo.14834669
  15. Data for : Reconstruction of the socio-semantic dynamics of political...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gaumont, Noé; Panahi, Maziyar; Chavalarias, David (2023). Data for : Reconstruction of the socio-semantic dynamics of political activist Twitter networks - Method and application to the 2017 French presidential election [Dataset]. http://doi.org/10.7910/DVN/AOGUIA
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Gaumont, Noé; Panahi, Maziyar; Chavalarias, David
    Time period covered
    Jul 1, 2016 - May 8, 2017
    Description

    These are the data related to the PLOS ONE paper: Gaumont N, Panahi M, Chavalarias D (2018) Reconstruction of the socio-semantic dynamics of political activist Twitter networks - Method and application to the 2017 French presidential election. PLoS ONE 13(9): e0201879. https://doi.org/10.1371/journal.pone.0201879

    The paper proposes an integrated methodology for the data collection, reconstruction, and visualization of the development of a country's political environment from Twitter data. These data cover several aspects of the analysis of the 2017 French presidential election campaign from the perspective of Twitter: processing of the Twitter data, intermediary results computed on the tweets dataset (for example, text-mining results), and additional data from the candidates' programs. Additional information is given in the Supporting Information texts.

  16. Political Tweets Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 23, 2024
    Cite
    Bright Data (2025). Political Tweets Dataset [Dataset]. https://brightdata.com/products/datasets/twitter/tweets/political
    Available download formats
    .json, .csv, .xlsx
    Dataset updated
    Dec 23, 2024
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Utilize our Political Tweets dataset to enhance campaign strategies and gain insights into public discourse. This dataset offers a comprehensive view of political dynamics on social media, empowering organizations, researchers, and policymakers to analyze trends and sentiment. Access the full dataset or customize it with specific data points tailored to your needs. Popular use cases include:

    • Sentiment Analysis: Analyze publicly available political tweets to understand public sentiment on policies, events, and candidates, aiding campaign strategies and opinion research.
    • Trend Monitoring: Track trending topics and hashtags in political discourse to identify key issues and shifts in public priorities across demographics.
    • Misinformation Detection: Detect and analyze patterns of misinformation, supporting efforts to combat its spread effectively.

    Harness these insights to stay informed and adapt to the evolving political landscape.

  17. Replication Data for: "Most users do not engage with political elites on...

    • search.dataone.org
    Updated Nov 8, 2023
    Cite
    Casas, Andreu (2023). Replication Data for: "Most users do not engage with political elites on Twitter; Those who do, show overwhelming preferences for ideological congruity" [Dataset]. http://doi.org/10.7910/DVN/G5APVR
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Casas, Andreu
    Description

    This repository contains the replication material for the article "Most users do not engage with political elites on Twitter; Those who do, show overwhelming preferences for ideological congruity", to be published in Science Advances, by Magdalena Wojcieszak, Andreu Casas, Xudong Yu, Jonathan Nagler, and Joshua Tucker. One of the datasets (ingroup-sharing-model-data.csv) is too large for Dataverse; a copy is available in this Google Drive folder: https://drive.google.com/drive/folders/1EYqaSF-EukTGhanogqaevSnjn7l0koEH?usp=sharing. You can also clone and use this code and data via GitHub: https://github.com/CasAndreu/ingroup_filtering

  18. Twitter data on political debates about the Italian immigration policies

    • dataverse.harvard.edu
    pdf, tsv
    Updated Jun 8, 2020
    Cite
    Harvard Dataverse (2020). Twitter data on political debates about the Italian immigration policies [Dataset]. http://doi.org/10.7910/DVN/CUWN54
    Available download formats
    tsv (115906251), pdf (20581)
    Dataset updated
    Jun 8, 2020
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Statistics indicate that Twitter is the preferred social network among journalists in Europe, which makes it a well-suited environment for studying debates about society, economics, and politics. The dataset depicts the retweeting activity in an Italian debate on immigration policies over a period of one month (from January 23, 2019 to February 22, 2019). The dataset is labeled according to the boticity score of the users participating in the discussion, as produced by Botometer, a popular bot detector. All accounts have been classified as either human-operated or bots. Due to Twitter's developer terms, we can only provide IDs for users and tweets, which can be used to retrieve the original data through the Twitter API. For additional details, please refer to "Twitter data on political debates about the Italian immigration policies" (currently under submission to CIKM 2020 resource papers).

  19. Replication Data : Twitter and the Projection of Political Personalities in...

    • search.dataone.org
    Updated Sep 24, 2024
    Cite
    Jumle, Vihang; Karthik KR, Vignesh; Jumle, Vedant (2024). Replication Data : Twitter and the Projection of Political Personalities in India [Dataset]. http://doi.org/10.7910/DVN/ATGGJR
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Jumle, Vihang; Karthik KR, Vignesh; Jumle, Vedant
    Description
  20. Twitter data on 2020 U.S. Presidential Election Debates

    • dataverse.harvard.edu
    Updated Oct 10, 2022
    Cite
    Manuel Pratelli; Fabio Saracco; Marinella Petrocchi; Rocco De Nicola (2022). Twitter data on 2020 U.S. Presidential Election Debates [Dataset]. http://doi.org/10.7910/DVN/ANBPTC
    Dataset updated
    Oct 10, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Manuel Pratelli; Fabio Saracco; Marinella Petrocchi; Rocco De Nicola
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    The dataset contains Twitter traffic related to the 2020 US pre-election debate. The data was collected considering two types of states, namely swing and safe. The term 'swing' refers to those states in which one cannot be certain of a landslide victory for either Republicans or Democrats, as there is no clear historical orientation of the electorate. In contrast, a state is defined as 'safe' if its citizens have traditionally elected representatives of the same political party. In particular, the tweets are from four swing states (Arizona, Florida, Michigan, and Pennsylvania) and four safe states (New Jersey, Indiana, Washington, and Louisiana). The dataset was used in the study "Swinging in the States: Does disinformation on Twitter mirror the US presidential election system?" (under submission).

Cite
Kash (2022). Global Political tweets [Dataset]. https://www.kaggle.com/kaushiksuresh147/political-tweets/tasks

Global Political tweets

Tweets across the globe with trending #Politics hashtag

Dataset updated
Aug 23, 2022
Dataset provided by
Kaggle
Authors
Kash
License

https://creativecommons.org/publicdomain/zero/1.0/

Description


  • Social media is becoming a key medium through which we communicate with each other: it is at the center of the very structures of our daily interactions. Yet this infiltration is not unique to interpersonal relations. Political leaders, governments, and states operate within this social media environment, wherein they continually address crises and institute damage control through platforms such as Twitter.

  • With the spread of the internet among the masses, social media is emerging as a powerful channel of communication. It gives politicians a direct channel for communicating, connecting, and engaging with the public. The power of social media, especially Twitter and Facebook, has been demonstrated by its successful use during recent US presidential elections and the revolts in Arab countries. In India too, with the general election approaching in early 2014, political parties and leaders are trying to harness the power of social media.

Content

The tweets have the #Politics hashtag. The collection started on 24 July 2021 and will be updated on a daily basis.

Information regarding the data

The data consists of more than 1 lakh (100,000) records with 13 columns. The features are described below.

| No. | Column | Description |
| -- | -- | -- |
| 1 | user_name | The name of the user, as they’ve defined it. |
| 2 | user_location | The user-defined location for this account’s profile. |
| 3 | user_description | The user-defined UTF-8 string describing their account. |
| 4 | user_created | Time and date when the account was created. |
| 5 | user_followers | The number of followers an account currently has. |
| 6 | user_friends | The number of friends an account currently has. |
| 7 | user_favourites | The number of favorites an account currently has. |
| 8 | user_verified | When true, indicates that the user has a verified account. |
| 9 | date | UTC time and date when the Tweet was created. |
| 10 | text | The actual UTF-8 text of the Tweet. |
| 11 | hashtags | All the other hashtags posted in the Tweet along with #Politics. |
| 12 | source | Utility used to post the Tweet; Tweets from the Twitter website have a source value of web. |
| 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |

Inspiration

You can use this data to explore the subjects discussed under this hashtag, examine the geographical distribution of tweets, evaluate sentiment, and track trends.
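As a starting point, the geographical distribution and daily trend could be tallied along the following lines. This is only a sketch: it uses the `user_location` and `date` columns described above, but the inline sample rows are invented for illustration, and the `YYYY-MM-DD HH:MM:SS` date format is an assumption about how the CSV stores timestamps.

```python
import csv
import io
from collections import Counter

# Invented sample rows; the real file has 13 columns and far more records.
SAMPLE = """user_location,date,is_retweet
"Delhi, India",2021-07-24 10:15:00,False
"Delhi, India",2021-07-24 18:40:00,True
London,2021-07-25 09:05:00,False
,2021-07-25 11:30:00,False
"""


def summarize(rows):
    """Count tweets per user-defined location and per calendar day."""
    by_location = Counter()
    by_day = Counter()
    for row in rows:
        # user_location is free text and often empty.
        location = (row.get("user_location") or "").strip() or "(unknown)"
        by_location[location] += 1
        # Assumes the date string starts with YYYY-MM-DD.
        by_day[(row.get("date") or "")[:10]] += 1
    return by_location, by_day


locations, days = summarize(csv.DictReader(io.StringIO(SAMPLE)))
```

Because `user_location` is user-defined, real data would likely need normalization (for example, merging spelling variants of the same city) before any geographic conclusions are drawn.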
