License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
Social media is becoming a key medium through which we communicate with each other: it is at the center of the very structures of our daily interactions. Yet this infiltration is not unique to interpersonal relations. Political leaders, governments, and states operate within this social media environment, wherein they continually address crises and institute damage control through platforms such as Twitter.
With the spread of the internet among the masses, social media is emerging as a potent means of communication. It gives politicians a direct channel for communicating, connecting, and engaging with the public. The power of social media, especially Twitter and Facebook, was demonstrated by its successful use during recent US presidential elections and the uprisings in Arab countries. In India too, with the general election approaching in early 2014, political parties and leaders are trying to harness the power of social media.
The tweets have the #Politics hashtag. The collection started on 24/7/2021, and will be updated on a daily basis.
The data consists of more than 1 lakh (100,000) records with 13 columns. The features are described below.

| No | Column | Description |
| -- | -- | -- |
| 1 | user_name | The name of the user, as they’ve defined it. |
| 2 | user_location | The user-defined location for this account’s profile. |
| 3 | user_description | The user-defined UTF-8 string describing their account. |
| 4 | user_created | Time and date when the account was created. |
| 5 | user_followers | The number of followers an account currently has. |
| 6 | user_friends | The number of friends an account currently has. |
| 7 | user_favourites | The number of favorites an account currently has. |
| 8 | user_verified | When true, indicates that the user has a verified account. |
| 9 | date | UTC time and date when the Tweet was created. |
| 10 | text | The actual UTF-8 text of the Tweet. |
| 11 | hashtags | All the other hashtags posted in the tweet along with #Politics. |
| 12 | source | Utility used to post the Tweet. Tweets from the Twitter website have a source value of web. |
| 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |

You can use this data to dive into the subjects that use this hashtag, look at the geographical distribution, evaluate sentiments, and look at trends.
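As a starting point for the subject and geography questions, the sketch below counts the hashtags that co-occur with #Politics and the most common user locations. It uses only the `hashtags` and `user_location` columns described above. The sample rows are made up, and the assumption that `hashtags` is stored as a stringified Python list is not confirmed by the dataset description, so check the real file before relying on it.

```python
import ast
import csv
import io
from collections import Counter

# Tiny in-memory sample standing in for the real CSV (rows are made up).
SAMPLE_CSV = '''user_location,hashtags
"Delhi, India","['Politics', 'Election']"
"London","['Politics']"
"Delhi, India","['Politics', 'Economy', 'Election']"
,"['Politics']"
'''


def parse_hashtags(raw):
    """Parse a stringified list such as "['Politics', 'Election']" (assumed format)."""
    if not raw:
        return []
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return []
    if not isinstance(value, list):
        return []
    return [str(tag).lower() for tag in value]


def summarize(rows):
    """Count co-occurring hashtags (excluding #Politics itself) and user locations."""
    hashtag_counts = Counter()
    location_counts = Counter()
    for row in rows:
        for tag in parse_hashtags(row.get("hashtags", "")):
            if tag != "politics":
                hashtag_counts[tag] += 1
        location = (row.get("user_location") or "").strip()
        if location:  # skip accounts with no location set
            location_counts[location] += 1
    return hashtag_counts, location_counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hashtag_counts, location_counts = summarize(rows)
print(hashtag_counts.most_common(3))
print(location_counts.most_common(3))
```

To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle. User-defined locations are free text, so they would likely need normalizing before any map-based analysis.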
By CrowdFlower [source]
This dataset provides an insightful look at thousands of social media messages from US Senators and other American politicians. Contributors studied the content and classified each message by audience (national or constituent), by bias, and by its actual substance (informational, announcement of a media appearance, an attack on another candidate, etc.). The dataset is a valuable tool for studying the political dynamics of modern America, for example by comparing partisan and neutral/bipartisan message leanings among different audiences. With 5000 rows drawn from various sources, you can explore what kinds of reactions different types of political messages produce. Get ready for a never-before-seen view on the world
This dataset can be used to uncover political bias in social media communication among US Senators and other American politicians. The data contains 5000 social media messages, which have been classified into audience, bias and message content.
To get started with the dataset, familiarize yourself with its columns:
- '_golden' (Boolean): Indicates whether the row is a golden row or not
- '_unit_state' (String): State of the unit
- '_trusted_judgements' (Integer): Number of trusted judgments for this row
- '_last_judgement' (Timestamp): When the last judgment was made for this post
- 'audience' (String): The intended audience for this post: national or constituency
- 'bias' (String): The type of bias this message conveys: neutral/bipartisan or biased/partisan
- 'orig_golden' (Boolean): Indicates whether the row is a golden row or not
- 'audience_gold' (String): Audience type which has been set as gold standard
- 'bias gold' (String): Bias type which has been set as gold standard
- embed (String): Embed code of post
- label (String): Label assigned to post based on its sentiment
- message gold (String): Message content which has been set as golden value
- Source (String): Source from which the information originated
- text (String): Text used in message by author
Once you are familiar with these columns, you can explore different ways to analyze the data. For example, you may want to create visualizations such as heat maps that show partisan and bipartisan messages across geographies or states, analyze usage patterns by time or day of week, or chart changes in message tone over time for specific accounts. You may also want to look at trends by political party to see whether some topics are more popular than others among certain groups. Finally, topic modelling techniques such as LDA can help identify the key topics present across tweets from different accounts and compare each group's opinion on those topics.
- Developing an automated classification system using machine learning algorithms to accurately classify the audience, bias, and message content of social media messages.
- Identifying the topics or issues most discussed on political social media accounts by tracking messages with certain tags or labels over time.
- Clustering tweets based on characteristics such as user behavior and sentiment to better understand how people interact with politically charged content on social media platforms, and to draw insights into the evolution of public opinion around important topics such as elections or changes in the law.
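Before building a classifier, a simple cross-tabulation of the `audience` and `bias` columns gives a first view of how partisanship varies by intended audience. The sketch below uses only the standard library. The sample rows and the exact label strings (`national`, `constituency`, `partisan`, `neutral`) are assumptions for illustration; the labels in the real file may be spelled differently.

```python
import csv
import io
from collections import Counter

# Made-up rows; label spellings are assumed, not taken from the real file.
SAMPLE_CSV = """audience,bias,text
national,partisan,Example message one
constituency,neutral,Example message two
national,neutral,Example message three
national,partisan,Example message four
"""


def crosstab(rows, row_key, col_key):
    """Count how often each (row_key value, col_key value) pair occurs."""
    counts = Counter()
    for row in rows:
        counts[(row[row_key], row[col_key])] += 1
    return counts


def partisan_share(counts, audience):
    """Fraction of messages for the given audience that are labeled partisan."""
    total = sum(n for (aud, _), n in counts.items() if aud == audience)
    partisan = counts.get((audience, "partisan"), 0)
    return partisan / total if total else 0.0


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = crosstab(rows, "audience", "bias")
for audience in ("national", "constituency"):
    print(audience, round(partisan_share(counts, audience), 2))
```

The same `crosstab` helper can be pointed at any other pair of categorical columns, such as `bias` against the message-type label.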
If you use this dataset in your research, please credit the original authors. Data Source
Unknown License - Please check the dataset description for more information.
File: Political-media-DFE.csv | Column name | Description | |:-----------------------|:---------------------------------------------------------------------------------------------------------------------------------------------...
License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset contains 40 social media posts collected from multiple platforms (Twitter, Facebook, Instagram, YouTube, TikTok). It provides a detailed view of how different types of content perform, how users engage with them, and how moderation systems respond.

- **Platform & Content:** Includes post type (Tweet, Story, Video, etc.), unique IDs, and timestamps.
- **User Information:** Follower counts and verification status.
- **Content Metadata:** Text, category, language, country, length, media type, and presence of external links.
- **Engagement Metrics:** Like, share, and comment counts, along with an overall engagement score.
- Misinformation Flag
- Fact-Check Source
- Moderation Action (e.g., Approved, Warning Label, Demonetized, Removed)
- Sentiment Score (positive/negative tone)
- Toxicity Score (harassment/offensive likelihood)
- Political Leaning (Neutral, Liberal, Conservative, Conspiracy)
- Topic Tags (e.g., climate, vaccine, election, 5G)
- Virality Indicators: Viral score estimating likelihood of content going viral.

- **Fake News & Misinformation Research** – Train ML models to detect misinformation.
- **Content Moderation Systems** – Study how platforms label, remove, or demonetize harmful content.
- **NLP & Sentiment Analysis** – Analyze toxicity, bias, and sentiment across platforms.
- **Trend Analysis** – Compare engagement across topics (climate change, vaccines, elections, 5G).
- **Political Bias Detection** – Explore correlations between political leaning, engagement, and moderation.

- 40 posts
- 25 features

This dataset is a synthetic but realistic representation of social media activity. It can be useful for machine learning, data analysis, and visualization projects related to misinformation, user engagement, and platform moderation.
License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset provides a comprehensive collection of public sentiment and discourse related to Indian politics. The entries cover a wide range of opinions, news, social media posts, and other forms of public communication. Each entry is meticulously labeled with a sentiment score, capturing the polarity of the opinion from strongly negative to strongly positive.
This dataset is structured to facilitate detailed sentiment analysis and examination of political sentiments in India.
Use Cases
This dataset is ideal for:
- Sentiment Analysis: Researchers can use this dataset to train and evaluate sentiment analysis models specifically tailored to the political context in India.
- Trend Analysis: Analysts can track the evolution of public sentiment over time, identifying key events that influenced public opinion.
- Political Studies: Scholars can investigate the relationship between public sentiment and political events, figures, and policies in India.
- Natural Language Processing (NLP): NLP practitioners can leverage this dataset for various tasks such as text classification, opinion mining, and more.
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset contains 85,154 Twitter posts related to Indian political discourse, collected between October 2022 and March 2023. It includes tweet text, user identifiers, temporal metadata, and engagement metrics such as likes and retweets, enabling analysis of interaction patterns and engagement behavior in high-activity public discussions.
The dataset consists of 9 variables and is fully cleaned, with no missing values, duplicate records, or invalid timestamps. Derived temporal features (Year, Month, Day) are perfectly consistent with the original timestamp, ensuring reliability for time-based analysis.
Multiple forensic checks were performed to evaluate whether engagement metrics reflect real-world social media behavior:
Engagement metrics display realistic long-tail distributions, with a small fraction of highly engaged tweets and minimal zero inflation. The dataset contains over 58,000 distinct users and more than 98% unique tweet content, further supporting data authenticity.
In addition to tweet-level data, the dataset includes a user interaction network represented as directed edges. Each edge denotes an interaction between two users, derived from observable Twitter actions such as replies, mentions, or retweets. This network structure enables graph-based analysis of information flow, influence patterns, and community behavior within political discussions.
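A directed edge list like this supports simple influence measures without any graph library. The sketch below computes in-degree (how often a user is replied to, mentioned, or retweeted) and out-degree (how often a user initiates an interaction). The edge tuples and user IDs are hypothetical, since the actual file layout and column names of the network are not described here.

```python
from collections import Counter

# Hypothetical (source_user, target_user) pairs; real IDs and file layout may differ.
edges = [
    ("u1", "u2"),
    ("u3", "u2"),
    ("u2", "u4"),
    ("u1", "u4"),
    ("u5", "u2"),
]


def degree_counts(edge_list):
    """Return (in_degree, out_degree) counters for a directed edge list."""
    out_degree = Counter(src for src, _ in edge_list)
    in_degree = Counter(dst for _, dst in edge_list)
    return in_degree, out_degree


in_degree, out_degree = degree_counts(edges)
most_addressed = in_degree.most_common(1)[0]  # (user, number of incoming edges)
print("most addressed user:", most_addressed)
print("most active user:", out_degree.most_common(1)[0])
```

In-degree is only a rough proxy for influence. Community detection or centrality measures would need a dedicated graph library.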
The dataset is suitable for machine learning and analytical tasks such as engagement prediction, content analysis, user behavior modeling, and temporal interaction studies. Political content is treated solely as a high-engagement discussion domain and does not imply ideological inference or endorsement.
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset, from Crowdflower's Data For Everyone Library, provides text of 5000 messages from politicians' social media accounts, along with human judgments about the purpose, partisanship, and audience of the messages.
Contributors looked at thousands of social media messages from US Senators and other American politicians to classify their content. Messages were broken down into audience (national or the tweeter’s constituency), bias (neutral/bipartisan, or biased/partisan), and finally tagged as the actual substance of the message itself (options ranged from informational, announcement of a media appearance, an attack on another candidate, etc.)
Data was provided by the Data For Everyone Library on Crowdflower.
Our Data for Everyone library is a collection of our favorite open data jobs that have come through our platform. They're available free of charge for the community, forever.
Here are a couple of questions you can explore with this dataset:
The dataset contains one file, with the following fields:
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
The US Election 2024 Social Media Sentiment Dataset captures 100 authentic, anonymized posts from X (formerly Twitter) collected during November 5-6, 2024, coinciding with the US Presidential Election's critical period. This dataset reflects real-time public opinions, emotions, and discussions surrounding the election, focusing on candidates (Donald Trump, Kamala Harris), voting processes, and media narratives. Sourced via X's official API, the data ensures compliance with platform policies and prioritizes ethical considerations by anonymizing user identities.
Size: 100 unique posts (excluding replies and quoted posts to avoid redundancy).
Attributes:
Posts were collected using X's API with targeted queries (e.g., "#USElection2024", "Trump", "Harris" -filter:replies) and a minimum engagement filter (min_faves:1) to ensure relevance. The dataset was cleaned to remove sensitive information (e.g., full URLs where non-essential) while retaining original text for analysis. The collection focused on the latest posts to capture real-time reactions.
This dataset is a valuable resource for data scientists, political researchers, and students studying social media’s impact on the 2024 US Presidential Election. It provides a snapshot of public discourse, ideal for NLP, social network analysis, and trend detection.
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset consists of 50,034 Twitter posts curated for the task of political vs non-political tweet classification. Each record contains two columns: the raw tweet text and a binary label indicating whether the tweet is political in nature.
- tweet – textual content of the tweet
- type – class label (1 = political, 0 = non-political)

The dataset exhibits a moderate class imbalance, reflecting real-world Twitter discourse:
- Political (type = 1): 27,459
- Non-political (type = 0): 22,710

The following validation steps were performed:
These checks confirm that the dataset is clean and suitable for supervised learning.
The dataset includes a wide variety of tweet styles, such as political statements, opinions, hashtags, mentions, and informal language. Both English and Hindi (including code-mixed usage) are present, closely resembling real-world Twitter data.
Political tweets typically reference:
- Political leaders
- Government actions and policies
- Elections and national issues
Non-political tweets typically include:
- Casual conversations
- General information or personal content
- Hashtags without political context
After comprehensive cleaning and validation, the dataset is:
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset contains 1,500 records of political misinformation analyzed for the presence and effect of metaphor types on belief formation and emotional response. The data were collected across various media sources including social media posts, political advertisements, and news articles. Each record includes information on metaphor type (e.g., fear-based, nationalistic, artistic, cognitive), the political topic, emotional tone, and user engagement metrics. The dataset is complemented by responses from 547 participants, providing scores for belief acceptance and emotional intensity.
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
By Twitter [source]
At the heart of Joe Biden's successful election campaign was his effective and engaged use of social media. This dataset provides insight into how Biden used Twitter to create engaging conversations, share his views on policy issues, and build positive relationships with his followers. Researchers can use this data to observe the likes, retweets, shares, and replies that Biden's posts generated over time to better understand how he connected with people. Explore this dataset to track hourly, daily, and weekly activity and see how Joe Biden crafted his message on social media. Analyze outlinks for discussion topics relevant to elections, or pull quoted tweets from Twitter users who engaged in conversations with him. You'll be able to see first-hand how Joe Biden engaged in dialogue with individuals across America, and gain insight into the impact that digital communication had on this particular political race.
This dataset offers researchers, journalists and political analysts a comprehensive understanding of how former Vice President Joe Biden’s social media activity provides insight into his views and opinions on policy, foreign relationships and election dynamics.
Through this dataset, users can identify trends in the number of likes, retweets and replies that are generated by the posts from Joe Biden’s Twitter account. Along with this data users can also observe changes in the quoted Tweets, outlinks mentioned in posts as well as the URLs associated with them.
To make full use of this dataset, follow these steps:
1. Begin by exploring the key columns such as content (tweet text), created_at (date/time posted), likeCount (number of likes on tweet), retweetCount (number of retweets on tweet) and replyCount (number of replies to tweet).
2. Using analytical tools, explore correlations between variables, for example between created_at and columns such as quoteCount or outlinks, to see whether the timing of a post matters. Compare these against variables such as the type and length of the post, the medium used, and whether the post was made by Joe Biden himself or by a campaign staff member.
3. Explore which tweets reach more people with higher engagement rates in shorter time frames, using variables like retweetedTweet and quotedTweet alongside other fields, to learn what kinds of messages work better than others at specific times and in specific situations during campaigns.
4. Follow up on observed patterns to draw conclusions about outreach activity during campaigning periods, using methods such as data visualisations along timelines that link multiple tweets together, or finding the geographic regions where Joe Biden has the most followers.

Finally, never forget that proper hypothesis testing and comparison are essential when working with large datasets and correlating facts across multiple channels, especially for political topics involving a public figure being analyzed through their own tweets!
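The first two steps can be sketched with the standard library alone: parse `created_at`, sum likes, retweets, and replies into a single engagement figure, and average it by hour of day. The column names come from the description above, but the sample rows are made up and the timestamp format (`%Y-%m-%d %H:%M:%S`) is an assumption, so adjust the format string to match the real file.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Made-up rows; the timestamp format is an assumption about the real CSV.
SAMPLE_CSV = """created_at,likeCount,retweetCount,replyCount
2020-10-01 14:05:00,1000,200,50
2020-10-01 14:45:00,3000,400,150
2020-10-02 09:10:00,500,100,20
"""


def mean_engagement_by_hour(rows, time_format="%Y-%m-%d %H:%M:%S"):
    """Average likes + retweets + replies per tweet, grouped by posting hour."""
    totals = defaultdict(lambda: [0, 0])  # hour -> [engagement sum, tweet count]
    for row in rows:
        hour = datetime.strptime(row["created_at"], time_format).hour
        engagement = (
            int(row["likeCount"]) + int(row["retweetCount"]) + int(row["replyCount"])
        )
        totals[hour][0] += engagement
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in totals.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_hour = mean_engagement_by_hour(rows)
for hour in sorted(by_hour):
    print(f"{hour:02d}:00  mean engagement = {by_hour[hour]:.1f}")
```

An unweighted sum treats a like and a reply as equivalent. Weighting them differently, or analyzing each count separately, may suit some questions better.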
- Analyzing the sentiment of Joe Biden's tweet text and how it changes over time.
- Tracking engagement with different topics to understand which issues are most important to him and his followers.
- Comparing tweet engagement dynamics between Joe Biden and other prominent political figures for research comparison studies
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: JoeBiden.csv | Column name | Description | |:-------------------|:-----------------------------------------------------------------------------------------------------------------------| ...
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
The Freedmen's Bureau, formally known as the Bureau of Refugees, Freedmen, and Abandoned Lands, was established in 1865 by the U.S. Congress to aid formerly enslaved people in the South during the Reconstruction era following the Civil War. The Bureau's responsibilities included providing food, housing, education, and medical care. Furthermore, it helped formerly enslaved individuals legalize marriages, pursue employment, locate lost family members, and establish schools.
The documents in the dataset represent a range of records created or managed by the Freedmen's Bureau, reflecting its diverse functions and role in the Reconstruction era.
These records provide valuable insights into the social, economic, and political conditions during this transformative era in American history.
Link to Website: Visit the Freedmen's Bureau Online Archive at https://freedmensbureau.info/
We are excited to share this rich dataset of historical documents with the Kaggle community and beyond. This collection offers a unique window into a pivotal era, brimming with stories waiting to be discovered and analyzed. Your expertise and curiosity can help unearth new insights and deepen our collective understanding of this Post-Civil War period.
Diverse Perspectives: Each researcher brings a unique perspective to the table. By analyzing this data, you can contribute to a more comprehensive and nuanced understanding of history.
Innovative Analysis: Whether you are a seasoned data scientist, a student of history, a language enthusiast, or someone with a passion for uncovering the past, your analysis can reveal trends, patterns, and stories that might otherwise remain hidden.
Collaborative Discovery: Share your findings with the community. Engage in discussions, compare results, and collaborate to build a richer narrative.
Conduct Analysis: Use tools in NLP, data visualization, or statistical analysis to explore the dataset.
Share Insights: Publish your findings on Kaggle, in academic journals, or through social media. Engage with others' work and offer constructive feedback.
Build Projects: Employ the dataset as a basis for research projects, educational materials, or innovative applications.
Download the dataset and start exploring. Share any interesting patterns, anomalies, or insights you discover. If you’re new to NLP or data analysis, seize this opportunity to learn and grow. A supportive community awaits you here.
Together, let's illuminate the past to inform our present and future. We can't wait to see the incredible work you'll do with this dataset!
Contracts
Focused on contracts, this sheet includes agreements related to labor, apprenticeships, and other binding agreements from the Reconstruction era. This is indicated by subcategory entries such as "Apprenticeship Agreement".
Court Records
This section comprises court records, including arrest reports as seen in the subcategory field. It offers a glimpse into the legal proceedings and judicial matters handled by the Bureau.
Education Records
This sheet includes documents related to education, encompassing school establishment documents, expenses, and other educational matters. These records offer insights into the efforts to educate and uplift newly freed individuals.
Financial Records
In this sheet, financial records range from budgets to expense reports. The subcategory differentiates between general financial records and specific types like cover pages of reports.
Letters & Reports
This sheet contains transcriptions of letters and reports from the Freedmen's Bureau.
Personnel Records
This section contains documents related to personnel, potentially including staff or individuals associated with the Freedmen's Bureau. It may include employment records, duty rosters, or personnel reports.
Property Records
This sheet focuses on property-related records, possibly including bonds, contracts, or ownership documents. The sub_category field differentiates between types of property records (e.g., "Bond", "Cover Page"), while the transcription_text provides detailed content.
Rations Records
This sheet details the distribution or requests for food rations. It may include appeals like the provided example, where individuals or families seek assistance. The columns follow the same structure, offering direct insights into the socio-economic conditions of the Post-Civil War era.
Transportation Records
Focused on transportation-related documents, this sheet contains records about the movement of goods and people. The ...
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
This dataset contains posts and interactions from Donald J. Trump's Truth Social account, specifically during his 2024 U.S. Presidential election campaign. Each post entry provides detailed information, including the post content, number of replies, shares, likes, and metadata such as post date, media URLs (if available), and account details. The data offers a rich source for analyzing political messaging, engagement metrics, and audience reactions during the campaign period.
The posts are sourced directly from Trump's official Truth Social profile, capturing interactions that are publicly available.
The dataset may not include every post or interaction due to scraping limitations, and some interactions might lack context or additional details that could affect interpretability.
This dataset is intended for research and analysis purposes. Please ensure that any use of the data complies with Truth Social's terms of service and applicable copyright laws.
License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
To cite this dataset in your research paper, use the following reference.
@article{s13102024ijcatr13101005,
Title = "Comparing Political Inclination Classification on Twitter Posts using Naive Bayes, SVM, and XGBoost",
Journal ="International Journal of Computer Applications Technology and Research(IJCATR)",
Volume = "13",
Issue ="10",
Pages ="62 - 65",
Year = "2024",
Authors ="Shashank Shree Neupane, Atish Shakya, Bishan Rokka, Sagar Acharya"}
The details of the article are:
International Journal of Computer Applications Technology and Research Volume 13–Issue 10, 62 – 65, 2024, ISSN:-2319–8656 DOI:10.7753/IJCATR1310.1005
The link to article: https://ijcat.com/archieve/volume13/issue10/ijcatr13101005
The dataset contains Twitter posts by Nepali political leaders who belong to political parties. It can be used to gauge people's inclination towards a political party from their posts on social media such as X (formerly Twitter).
License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Techsalerator’s Location Sentiment Data for the United States of America
Techsalerator’s Location Sentiment Data for the United States offers a comprehensive dataset crucial for businesses, researchers, and technology developers. This dataset provides deep insights into location-based sentiment patterns, helping users understand regional and local variations in public opinion across different areas in the U.S.
For access to the full dataset, contact us at info@techsalerator.com or visit Techsalerator Contact Us.
Techsalerator’s Location Sentiment Data for the United States provides structured sentiment analysis across urban, suburban, and rural areas. This dataset is essential for AI development, market research, political analysis, and social studies.
To obtain Techsalerator’s Location Sentiment Data for the United States, contact info@techsalerator.com with your specific requirements. Techsalerator provides customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.
For detailed insights into location-based sentiment patterns across the United States, Techsalerator’s dataset is an invaluable resource for researchers, marketers, political analysts, and urban planners.
License: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
By Reddit [source]
Immerse yourself in the world of Reddit's Subreddit WTF with this comprehensive dataset! It offers a glimpse into Reddit, giving researchers access to 8 columns of data such as titles, scores, comment counts, URLs, creation dates and times, post bodies, and timestamps. With this data in hand, researchers can begin studying Reddit's user-generated content. Uncover trends that go beyond simple sentiment analysis, or explore topics important to the community! The possibilities for research are abundant; it's time to start finding the answers you need in this data set.
Using this dataset, you can uncover previously unseen insights on Reddit's Subreddit WTF. To get the most out of this dataset, you should familiarize yourself with the columns: title, score, url, comms_num (number of comments), created (creation date and time), body (the post's body) and timestamp. Each column holds information that you can use to uncover hidden trends and topics of discussion within Reddit’s Subreddit WTF community.
Start by exploring the data to gain an understanding of how Reddit works. Look for patterns such as hot topics being discussed or particular types of posts being upvoted more frequently than others. Understanding these patterns will help you build a better picture of what’s going on in Reddit’s Subreddit WTF community. Once you understand the data, start mining it with techniques such as sentiment analysis or keyword searches that add deeper insight into its user-generated content. By answering questions such as “what topics are people discussing?” or “what type of posts are generating more buzz?”, researchers can dig into related topics in detail.
Depending on your research needs, it may also be worth combining columns from the dataset (for example, URL with title or body for context) to answer a bigger research question than any single column could. For example, if you are studying the impact of politics on user content, analyzing post bodies together with their dates and timestamps could reveal increases or decreases in online activity in relation to political events happening in a given period.
- Analyzing the Sentiment of Users: By studying the language and body of Reddit posts, researchers can analyze sentiment and uncover any potential biases or patterns in user words. This could be used to infer subtle changes in sentiment or overall sentiment at a given point in time, as well as spot emerging topics or controversies.
- Uncover Popular Topics: By analyzing the titles, the topics discussed on the WTF subreddit can be determined, giving insight into what content might be popular on Reddit overall. This could also help reveal what type of content is widely accepted by Reddit audiences and 'smaller subcultures.'
- Tracking User Engagement: By studying scores and comment counts over time, researchers can track user engagement with posts to see when users are most likely to comment on posts or interact with one another. This could help shed light on user habits and preferences when it comes to engaging with content across different platforms like Reddit or other social media sites.
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: WTF.csv

| Column name | Description |
|:--------------|:--------------------------------------------------------|
| title | The title of the post. (String) |
| score | The number of upvotes the post has received. (Integer) |
| url | The URL of the post. (String) |

...
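As a rough illustration of the topic and engagement ideas above, the sketch below counts title keywords and ranks posts by score. It uses only the three documented columns (`title`, `score`, `url`). The sample rows, the stopword list, and the helper names `title_keywords` and `top_posts` are made up for this example and are not part of the dataset.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample rows standing in for WTF.csv; only the documented
# columns title, score, and url are used here.
SAMPLE_CSV = """title,score,url
Strange cat found in attic,120,https://example.com/a
Cat rides a strange bike,45,https://example.com/b
Bike stuck in a tree,300,https://example.com/c
"""

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "in", "the", "of", "and"}


def title_keywords(rows):
    """Count lowercase word frequencies across post titles."""
    counts = Counter()
    for row in rows:
        words = re.findall(r"[a-z']+", row["title"].lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


def top_posts(rows, n=2):
    """Return the n highest-scoring posts as (score, title) pairs."""
    ranked = sorted(rows, key=lambda r: int(r["score"]), reverse=True)
    return [(int(r["score"]), r["title"]) for r in ranked[:n]]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
keywords = title_keywords(rows)
best = top_posts(rows)
print(keywords.most_common(3))
print(best)
```

To run this against the real file, replace the `io.StringIO(SAMPLE_CSV)` source with an opened `WTF.csv` handle. The remaining columns are not listed in the description above, so they are left out.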
TigerDroppings.com, founded in 2001 by LSU alumnus Brian Fiegel, is among the most notable and active of any college sports forums on the internet, and its popularity hasn’t declined even as major social media platforms have come to dominate online spaces for discussion. The site’s userbase consists primarily of Louisiana residents and LSU graduates, though fans of other schools in the Southeastern Conference also frequent the site. The users on these sites fit a very specific demographic and they have little diversity. In a survey of users from numerous college sports forums, including TigerDroppings, it was found that 87% of users were male and 90% were white. Additionally, 76% had at least an undergraduate degree and 42% of users had a household income of $100,000 or greater. In November 2015, TigerDroppings had 129,244 users and now, seven years later, has 256,692 registered users; the site is still growing as fast as it did in the 2000s. There is a reason for this: these very specific demographics of the userbase are able to communicate in a way they otherwise couldn’t on Facebook or Twitter. Given these demographics, the forum takes on an overwhelmingly conservative tone in the opinions and sentiments regularly expressed. To put it simply, I couldn’t imagine a dataset that better encapsulates the psyche and mindset of white conservative men in Louisiana. Comprising almost 14 million political posts from 2014 to the present, it profiles the rise of Trumpism and the cataclysmic shifts seen in American politics in recent years.
Early in the site’s history, their off-topic board the “OT Lounge” was created and is the most popular board on the site, followed closely by their “Politics” board. Unlike many other similar forums, TigerDroppings relies solely on advertising to generate revenue, and all boards are free to view and create posts on. Only an email is required to sign up and all posts are anonymous; users are only outwardly identifiable by their chosen screennames. The functionality of the site has largely gone unchanged since its founding. Users can start a thread on a particular board and replies by other users are appended to the thread; there is no visible hierarchy to replies on threads, unlike platforms like Reddit, and it is very rudimentary by current standards. On every single reply in a thread there is an upvote and a downvote button; next to each button, their respective values are displayed, publicly showing the popularity of a user’s post. Users have been informally voting on political opinions and sentiments constantly, which I believe is rich for analyzing the rise of specific attitudes and rhetoric used among this demographic.
Attached to each post in the dataset are several pieces of metadata: upvotes, downvotes, username, date of post, date of thread creation, URLs from links contained in the post, URLs to images in the post, text from blockquotes, and the position of the post in its respective thread. Additionally, I was able to gather emails and phone numbers for approx. 3,000 users of the site through the Ticket Marketplace Board, as many users had posted contact info to interact with other users externally. Data from the OT Lounge could be scraped in its entirety from 2014 to present, though the data from the Politics Board has some gaps. None of the data from 2015 was publicly accessible, for unclear reasons. More interestingly, all threads from November 2, 2020 to January 7, 2021 (the day before the presidential election until the day after the insurrection at the U.S. Capitol) are not publicly accessible at all. I hypothesize there was significant activity talking about election fraud during that period, along with potentially incriminating information about posters who may have participated in the events on January 6th.
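The per-post metadata listed above could be modelled roughly as follows. This is a minimal sketch: the class name `ForumPost`, every field name, and the helpers `approval_ratio` and `in_reported_gap` are hypothetical, since the description does not give the dataset's actual column names. The gap dates are the ones reported above for the Politics Board.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ForumPost:
    """One reply in a thread; field names are illustrative assumptions."""
    username: str
    posted_on: date
    thread_created_on: date
    upvotes: int
    downvotes: int
    position_in_thread: int
    link_urls: list = field(default_factory=list)
    image_urls: list = field(default_factory=list)
    blockquotes: list = field(default_factory=list)

    def approval_ratio(self):
        """Share of votes that are upvotes, or None if nobody voted."""
        total = self.upvotes + self.downvotes
        return self.upvotes / total if total else None


# Politics Board gaps as reported in the description above.
GAP_START = date(2020, 11, 2)
GAP_END = date(2021, 1, 7)


def in_reported_gap(day):
    """True if a Politics Board date falls in a period reported as missing."""
    return day.year == 2015 or GAP_START <= day <= GAP_END


post = ForumPost(
    username="example_user",  # made-up screenname
    posted_on=date(2019, 3, 14),
    thread_created_on=date(2019, 3, 13),
    upvotes=30,
    downvotes=10,
    position_in_thread=5,
)
print(post.approval_ratio())
print(in_reported_gap(post.posted_on))
```

Flagging the reported gaps explicitly helps keep a time-series analysis from reading missing data as a drop in activity.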
In terms of where I want to go from here with this dataset, I am interested in exploring if a model to isolate and predict political trends among this demographic could be feasible, along with exploring what potential uses it has as a tool for electoral politics in Louisiana. If anything, I want to do some anthropological research about this demographic that so clearly to me describes the social, cultural, and political environment I was raised in.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains 8,892 Reddit posts and 361,595 comments (including nested replies) from multiple India-focused subreddits. It captures discussions spanning news, politics, technology, entertainment, sports, and memes, providing rich data for text analysis, sentiment research, trend detection, and social media studies.
Fields:

- subreddit: Name of the subreddit
- post_id: Unique identifier for the post
- post_title: Title of the post
- post_text: Full text of the post (if available)
- url: Link to the Reddit post
- comments: Nested JSON array of comments, each containing:
  - author: Comment author
  - body: Comment text
  - replies: Nested array of replies with the same structure

Subreddits covered: "india", "Indianews", "IndiaPolitics", "IndianPolitics", "AskIndia", "IndianStartups", "LegalAdviceIndia", "IndiaInvestments", "IndianGaming", "IndianGamers", "AndroidGaming", "IndiaTech", "TechNewsIndia", "EsportsIndia", "bollywood", "BollyBlindsNGossip", "BollywoodMusic", "Desihiphop", "NetflixBestOf", "DisneyPlus", "MoviesIndia", "IndianTV", "Cricket", "IndianFootball", "SoccerIndia", "DesiMemeTemplates", "IndianDankMemes", "IndianDankTemplates", "MemeTemplatesOfficial", "dankmemes", "MusicIndia", "indieheads", "popheads", "hiphopheads", "Metalcore", "Metal", "MusicNews", "AppleIndia", "ArtificialIntelligenceIndia", "ChatGPTIndia"
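Because each comment carries a `replies` array with the same structure, the comment tree is naturally walked recursively. The sketch below assumes a post record shaped like the field list above. The sample record and the helper names `count_comments` and `iter_bodies` are invented for illustration, and how the real file stores its records (one JSON object per line, a single array, and so on) is not stated here.

```python
import json

# Made-up record following the documented field layout.
SAMPLE_POST = json.loads("""
{
  "subreddit": "india",
  "post_id": "abc123",
  "post_title": "Example post",
  "post_text": "",
  "url": "https://example.com/post",
  "comments": [
    {"author": "user_a", "body": "Top-level comment", "replies": [
      {"author": "user_b", "body": "First reply", "replies": [
        {"author": "user_a", "body": "Nested reply", "replies": []}
      ]}
    ]},
    {"author": "user_c", "body": "Another top-level comment", "replies": []}
  ]
}
""")


def count_comments(comments):
    """Count comments at every nesting level."""
    total = 0
    for comment in comments:
        total += 1 + count_comments(comment.get("replies", []))
    return total


def iter_bodies(comments, depth=0):
    """Yield (depth, body) pairs in thread order, depth 0 = top level."""
    for comment in comments:
        yield depth, comment["body"]
        yield from iter_bodies(comment.get("replies", []), depth + 1)


total = count_comments(SAMPLE_POST["comments"])
flat = list(iter_bodies(SAMPLE_POST["comments"]))
print(total)
for depth, body in flat:
    print("  " * depth + body)
```

Flattening the tree into `(depth, body)` pairs is convenient for sentiment or keyword analysis, which usually works on a flat list of texts. For very deep threads an explicit stack would avoid Python's recursion limit.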
Data was collected from publicly available Reddit posts and comments using scraping methods (Reddit API). Only publicly accessible posts are included; no private content was collected.
License: Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Techsalerator’s Location Sentiment Data for Sri Lanka
Techsalerator’s Location Sentiment Data for Sri Lanka provides valuable insights into public sentiment across various regions of the country. This dataset is essential for businesses, researchers, and policymakers aiming to understand regional emotions, opinions, and attitudes. By analyzing location-based sentiment trends, organizations can enhance decision-making, marketing strategies, and customer engagement.
For access to the full dataset, contact us at info@techsalerator.com or visit Techsalerator Contact Us.
To obtain Techsalerator’s Location Sentiment Data for Sri Lanka, contact info@techsalerator.com with your specific requirements. Techsalerator offers customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.
For comprehensive insights into public sentiment and regional opinions across Sri Lanka, Techsalerator’s dataset is an essential tool for businesses, policymakers, and researchers.
This dataset captures election-related discussions on TruthSocial in the lead-up to the 2024 U.S. presidential election. With 1.5 million posts spanning from February 2022 to October 2024, this dataset provides insights into political discourse, community formation, and the spread of information on a prominent alt-tech platform.
TruthSocial, a platform focused on free speech, has attracted users with diverse political views, often leaning conservative. This dataset is ideal for researchers, data scientists, and political analysts interested in studying communication patterns, engagement trends, and sentiment on election-related topics in a less-moderated social media environment.
This dataset can be utilized for:
This dataset is intended for research purposes and should be cited appropriately if used in published work.
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Anjay23
Released under CC0: Public Domain
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
Social media is becoming a key medium through which we communicate with each other: it is at the center of the very structures of our daily interactions. Yet this infiltration is not unique to interpersonal relations. Political leaders, governments, and states operate within this social media environment, wherein they continually address crises and institute damage control through platforms such as Twitter.
With the spread of the internet to the masses, social media is emerging as a promising means of communication. It gives politicians a direct channel for communicating, connecting, and engaging with the public. The power of social media, especially Twitter and Facebook, was demonstrated by its successful use during recent US presidential elections and the uprisings in Arab countries. In India too, with the general election approaching in early 2014, political parties and leaders are trying to harness the power of social media.
The tweets have the #Politics hashtag. The collection started on 24/7/2021, and will be updated on a daily basis.
The data consists of more than 1 lakh (100,000) records with 13 columns. The features are described below.

| No | Columns | Descriptions |
| -- | -- | -- |
| 1 | user_name | The name of the user, as they’ve defined it. |
| 2 | user_location | The user-defined location for this account’s profile. |
| 3 | user_description | The user-defined UTF-8 string describing their account. |
| 4 | user_created | Time and date when the account was created. |
| 5 | user_followers | The number of followers an account currently has. |
| 6 | user_friends | The number of friends an account currently has. |
| 7 | user_favourites | The number of favorites an account currently has. |
| 8 | user_verified | When true, indicates that the user has a verified account. |
| 9 | date | UTC time and date when the Tweet was created. |
| 10 | text | The actual UTF-8 text of the Tweet. |
| 11 | hashtags | All the other hashtags posted in the tweet along with #Politics. |
| 12 | source | Utility used to post the Tweet. Tweets from the Twitter website have a source value of web. |
| 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |

You can use this data to explore the subjects discussed under this hashtag, examine the geographical distribution of users, evaluate sentiment, and track trends.
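As one possible starting point for the hashtag and geography questions, the sketch below counts hashtags that co-occur with #Politics and tallies user locations. It assumes the `hashtags` column is stored as a stringified Python list (for example `"['Politics', 'Election']"`) and may be empty. That storage format is an assumption and should be checked against the actual file. The sample rows and the helper `parse_hashtags` are hypothetical.

```python
import ast
from collections import Counter

# Made-up rows using two of the documented columns.
SAMPLE_ROWS = [
    {"user_location": "Delhi, India", "hashtags": "['Politics', 'Election']"},
    {"user_location": "London", "hashtags": "['politics', 'Brexit']"},
    {"user_location": "Delhi, India", "hashtags": None},
]


def parse_hashtags(raw):
    """Parse a stringified list of hashtags into lowercase strings.

    Assumes the column holds text such as "['Politics', 'Election']";
    anything missing or unparseable yields an empty list.
    """
    if not raw:
        return []
    try:
        tags = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return []
    if not isinstance(tags, list):
        return []
    return [str(tag).lower() for tag in tags]


co_tags = Counter()
locations = Counter()
for row in SAMPLE_ROWS:
    # Skip the collection hashtag itself so only co-occurring tags remain.
    co_tags.update(t for t in parse_hashtags(row["hashtags"]) if t != "politics")
    if row["user_location"]:
        locations[row["user_location"]] += 1

print(co_tags.most_common())
print(locations.most_common())
```

Since `user_location` is free text defined by each user, raw counts like these are only a first pass. Grouping by country or city would need a separate normalization step.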