We collected data about Facebook pages (November 2017). These datasets represent blue verified Facebook page networks of different categories. Nodes represent the pages and edges are mutual likes among them. We reindexed the nodes in order to achieve a certain level of anonymity. The csv files contain the edges -- nodes are indexed from 0. We included 8 distinct types of pages. These are listed below. For each dataset we listed the number of nodes and edges.
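As a minimal sketch of how such an edge file could be read, the Python below parses a two-column CSV of 0-indexed node ids into an undirected edge set and counts nodes and edges. The column names `node_1` and `node_2`, the presence of a header row, and the sample rows are assumptions for illustration, not details taken from the dataset.

```python
import csv
import io


def load_edges(handle, has_header=True):
    """Read an undirected edge list with two integer node ids per row."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the (assumed) header row
    edges = set()
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        u, v = int(row[0]), int(row[1])
        # Store each mutual like once, regardless of column order.
        edges.add((min(u, v), max(u, v)))
    return edges


def summarize(edges):
    """Return (number of nodes that appear in an edge, number of edges)."""
    nodes = {node for edge in edges for node in edge}
    return len(nodes), len(edges)


# Hypothetical sample: a triangle with one repeated edge.
sample = io.StringIO("node_1,node_2\n0,1\n1,2\n2,0\n1,0\n")
sample_edges = load_edges(sample)
n_nodes, n_edges = summarize(sample_edges)
```

Pages with no mutual likes would not appear in an edge file, so the node count obtained this way can be lower than the count listed for a dataset.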
https://academictorrents.com/nolicensespecified
171 million names (100 million unique)
This torrent contains:
- The URL of every searchable Facebook user's profile
- The name of every searchable Facebook user, both unique and by count (perfect for post-processing, datamining, etc)
- Processed lists, including first names with count, last names with count, potential usernames with count, etc
- The programs I used to generate everything
So, there you have it: lots of awesome data from Facebook. Now, I just have to find one more problem with Facebook so I can write "Revenge of the Facebook Snatchers" and complete the trilogy. Any suggestions? >:-)
Limitations
So far, I have only indexed the searchable users, not their friends. Getting their friends will be significantly more data to process, and I don't have those capabilities right now. I'd like to tackle that in the future, though, so if anybody has any bandwidth they'd like to donate, all I need is an ssh account and Nmap installed. An additional limitation is that these are on
Context
A collection of profile and content-based data from legitimate and spam Facebook profiles. It can be used for classification tasks.
Content
The dataset can be used for building machine learning models. The data was collected from public profiles using the Facebook API and the Facebook Graph API. There are 500 legit profiles and 100 spam profiles. The features are listed below; the Label column is 0 for legit and 1 for spam.
1. Number of friends
2. Number of followings
3. Number of communities
4. The age of the user account (in days)
5. Total number of posts shared
6. Total number of URLs shared
7. Total number of photos/videos shared
8. Fraction of the posts containing URLs
9. Fraction of the posts containing photos/videos
10. Average number of comments per post
11. Average number of likes per post
12. Average number of tags in a post (rate of tagging)
13. Average number of hashtags present in a post
Inspiration
The dataset helps the community understand which features can help distinguish legitimate Facebook users from spam users.
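Because the classes are imbalanced (500 legit versus 100 spam), a classifier that always predicts "legit" already scores well on accuracy. The sketch below works out that baseline and one common reweighting, the "balanced" heuristic n_samples / (n_classes * class_count). It is an illustration of the arithmetic only, not a method described by the dataset authors.

```python
# Class counts as stated in the description (0 = legit, 1 = spam).
counts = {0: 500, 1: 100}

total = sum(counts.values())

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(counts.values()) / total

# "Balanced" class weights: rarer classes receive proportionally more weight.
class_weights = {
    label: total / (len(counts) * n) for label, n in counts.items()
}

print(f"majority-class accuracy: {majority_baseline:.3f}")
print(f"class weights: {class_weights}")
```

With these counts, each spam profile is weighted five times as heavily as a legit profile, which is the inverse of the 5:1 class ratio.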
This table includes platform data for Facebook participants in the Deactivation experiment. Each row of the dataset corresponds to data from a participant’s Facebook user account. Each column contains a value, or set of values, that aggregates log data for this specific participant over a certain period of time.
Losing access to your Facebook account can be a stressful experience, especially if it is your primary social media platform for connecting with friends, family, or business contacts. Whether your account was hacked or disabled, or you simply forgot your login credentials, there are multiple ways to contact Facebook and attempt to recover your account. This guide walks through the process of recovering your Facebook account, including how to use recovery tools and what to do if your account is hacked or disabled.
1. Common Reasons for Losing Access to a Facebook Account
Before initiating the recovery process, it is important to identify why you lost access to your account. The reason affects how you approach Facebook:
- Forgotten password or email
- Lost access to the phone number or email linked to the account
- Hacked or compromised account
- Account disabled by Facebook for violating terms
- Suspicious activity detected
- Fake identity report
- Name policy violations
Each scenario has a different recovery method, and Facebook has dedicated tools and forms for each one.
2. First Steps Before Contacting Facebook
Before you attempt to reach Facebook's support directly, try these general steps:
- Use a known device and IP address: access Facebook from a browser or app you have used before.
- Clear cache and cookies if logging in on a web browser.
- Check if your account is still visible by searching for your name from another Facebook profile.
- Try logging in with alternate emails or phone numbers associated with the account.
If these don't work, proceed with specific recovery steps based on your situation.
3. Recovering an Account Using the "Forgot Password" Feature
The most common way to recover a Facebook account is by using the "Forgot Password" tool.
Steps:
- Go to facebook.com/login
- Click on "Forgotten password?"
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
The BuzzFeed dataset, officially known as the BuzzFeed-Webis Fake News Corpus 2016, comprises content from 9 news publishers over a 7-day period close to the 2016 US election. It was created to analyze the spread of misinformation and hyperpartisan content on social media platforms, particularly Facebook.
Dataset Composition
News Articles: The dataset includes 1,627 articles from various sources:
- 826 from mainstream publishers
- 256 from left-wing publishers
- 545 from right-wing publishers
Facebook Posts: Each article is associated with Facebook post data, including metrics like share counts, reaction counts, and comment counts.
Comments: The dataset includes nearly 1.7 million Facebook comments discussing the news content.
Fact-Check Ratings: Each article was fact-checked by professional journalists at BuzzFeed, providing veracity assessments.
Key Features
Publisher Information: The dataset covers 9 publishers, including 6 hyperpartisan (3 left-wing and 3 right-wing) and 3 mainstream outlets.
Temporal Aspect: The data was collected over seven weekdays (September 19-23 and September 26-27, 2016).
Verification Status: All publishers included in the dataset had earned Facebook's blue checkmark, indicating authenticity and elevated status.
Metadata: Includes various metrics such as publication dates, post types, and engagement statistics.
Potential Applications
The BuzzFeed dataset is valuable for various research and analytical purposes:
News Veracity Assessment: Researchers can use machine learning techniques to classify articles based on their factual accuracy.
Social Media Analysis: The dataset allows for studying how news spreads on platforms like Facebook, including engagement patterns.
Hyperpartisan Content Study: It enables analysis of differences between mainstream and hyperpartisan news sources.
Content Strategy Optimization: Media companies can use insights from the dataset to refine their content strategies.
Audience Analysis: The data can be used for demographic analysis and audience segmentation.
This dataset provides a comprehensive snapshot of news dissemination and engagement on social media during a crucial period, making it a valuable resource for researchers, data scientists, and media analysts studying online information ecosystems.
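The composition figures can be checked against each other with a few lines of Python. This sketch uses only the counts stated in the description; the dictionary keys are illustrative labels, not column names from the corpus.

```python
# Article counts per publisher orientation, as stated in the description.
articles = {"mainstream": 826, "left-wing": 256, "right-wing": 545}

total_articles = sum(articles.values())

# Share of each orientation, in percent, rounded to one decimal place.
shares = {
    name: round(100 * count / total_articles, 1)
    for name, count in articles.items()
}

# Publisher counts: 3 mainstream plus 6 hyperpartisan (3 left, 3 right).
publishers = {"mainstream": 3, "left-wing": 3, "right-wing": 3}
total_publishers = sum(publishers.values())

print(total_articles, shares, total_publishers)
```

The three article counts add up to the stated total of 1,627, and the publisher counts add up to 9, so the description is internally consistent on both points.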
https://academictorrents.com/nolicensespecified
This dataset consists of circles (or friends lists) from Facebook. Facebook data was collected from survey participants using a Facebook app. The dataset includes node features (profiles), circles, and ego networks.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Data sample for ExploreToM: Program-guided adversarial data generation for theory of mind reasoning
ExploreToM is the first framework to allow large-scale generation of diverse and challenging theory of mind data for robust training and evaluation. Our approach leverages an A* search over a custom domain-specific language to produce complex story structures and novel, diverse, yet plausible scenarios to stress test the limits of LLMs. Our A* search procedure aims to find particularly… See the full description on the dataset page: https://huggingface.co/datasets/facebook/ExploreToM.
All data required to reproduce the results of "How much research shared on Facebook is hidden from public view?". More information about the manuscript, code, and reproducibility can be found here.
This dataset contains five spreadsheets from two different sources:
1. Data collected with our own method described in Enkhbayar and Alperin (2018). More details and instructions can be found in this GitHub repository.
- plos_one_articles.csv: All articles published in PLOS ONE from 2015 to 2017
- altmetric_counts.csv: POS and TW counts retrieved from Altmetric™
- graph_api_counts.csv: AES counts collected with our methods using Facebook's Graph API
- query_details.csv: Responses from Graph API
2. Data provided by Piwowar et al. (2018)
- PLOS_2015-2017_idArt-DOI-PY-Journal-Title-LargerDiscipline-Discipline-Specialty.csv: Disciplinary categorisations for PLOS ONE publications as described in Piwowar et al. (2018)
References
Enkhbayar, A., & Alperin, J. P. (2018). Challenges of capturing engagement on Facebook for Altmetrics. STI 2018 Conference Proceedings, 1460–1469. Retrieved from http://arxiv.org/abs/1809.01194
Piwowar, H., Priem, J., Larivière, V., Alperin, J. P., Matthias, L., Norlander, B., … Haustein, S. (2018). The state of OA: A large-scale analysis of the prevalence and impact of Open Access articles. PeerJ, 6, e4375. doi: 10/ckh5
This dataset contains a random sample of 1000 Facebook image posts from a collection of Facebook public pages and groups in August, September, and October 2020.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
PE Video Dataset (PVD)
The PE Video Dataset (PVD) is a large-scale collection of 1 million diverse videos, featuring 120,000+ expertly annotated clips. The dataset was introduced in our paper "Perception Encoder".
Overview
PE Video Dataset (PVD) comprises 1M high-quality and diverse videos. Among them, 120K videos are accompanied by automated and human-verified annotations, and all videos are accompanied by video descriptions and keywords.… See the full description on the dataset page: https://huggingface.co/datasets/facebook/PE-Video.
This statistical dataset contains estimates of the number of active Facebook users living outside their country of origin within the European Union. The dataset includes information on Facebook users' age, gender, country of residence, and country of previous residence. The data is divided into the number of Monthly Active Users and Daily Active Users. The data was retrieved in standard CSV format from an advertising API platform using R Studio code, and the data collection was conducted twice a month from January to November 2021.
The dataset was originally published in DiVA and moved to SND in 2024.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Social behavior has a fundamental impact on the dynamics of infectious diseases (such as COVID-19), challenging public health mitigation strategies and possibly the political consensus. The widespread use of traditional and social media on the Internet provides us with an invaluable source of information on societal dynamics during pandemics. With this dataset, we aim to understand mechanisms of COVID-19 epidemic-related social behavior in Poland by deploying methods of computational social science and digital epidemiology. We collected and analyzed perceptions of COVID-19 on the Polish-language Internet during 15.01-31.07(06.08) and labeled the data quantitatively (Twitter, YouTube, Articles) and qualitatively (Facebook, Articles and Comments of Article) using an infomediological approach.
- manually labelled the 1,000 most popular tweets (twits_annotated.xlsx) with the categories is_fake (categorical and numeric), topic, and sentiment;
- extracted 57,306 representative articles (articles_till_06_08.zip) in Polish with the topic "Coronavirus" in the article body, using the Eventregistry.org tool;
- extracted 1,015,199 tweets (tweets_till_31_07_users.zip and tweets_till_31_07_text.zip) with the hashtag #Koronawirus in Polish, using the Twitter API;
- collected 1,574 videos (youtube_comments_till_31_07.zip and youtube_movie.csv) with the keyword "Koronawirus" on YouTube and 247,575 comments on them, using the Google API.
We supplemented the media observations with an analysis of 244 social empirical studies till 25.05 on COVID-19 in Poland (empirical_social_studies.csv).
Reports, analyses, and codebooks can be found in Polish at: http://www.infodemia-koronawirusa.pl
Main report (in Polish) https://depot.ceon.pl/handle/123456789/19215
https://networkrepository.com/policy.php
Mutually liked Facebook pages. Data collected about Facebook pages (November 2017). These datasets represent blue verified Facebook page networks of different categories. Nodes represent the pages and edges are mutual likes among them.
https://brightdata.com/license
Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.
Dataset Features
- User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
- Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
- Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
- Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.
Customizable Subsets for Specific Needs
Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.
Popular Use Cases
- Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
- Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
- Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
- Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
- AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.
Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The files of this dataset are no longer available. A revised version has been published at: https://doi.org/10.17026/dans-235-tba9
The main goal of the DFS data collection project is to map the online friendship networks of Dutch adolescents. Specifically, the Facebook networks of Dutch adolescents participating in the offline CILS4EU and CILSNL data collection are mapped. Facebook is an American social networking site (SNS) where users create an online profile, provide personal information on this profile and invite other users to become connected as friends. With these connections, users can interact via personal messaging, post directly on others' personal profile pages and react to others' posts. At the time of our data collection, in 2014, Facebook was the largest SNS in the world with approximately 1.3 billion members. The DFS data were collected to study the relationship between offline face-to-face contacts and online friendship networks on Facebook. For this purpose we coded variables that show respondents' Facebook friends' gender, number of friends, privacy settings and ethnicity.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Election Facebook’s Ad Metrics 2024: Trump vs. Harris
A key event of 2024 is the U.S. presidential election. This project focuses on analyzing how Donald Trump and Kamala Harris use advertising to win votes, exploring their strategies, actions, and effectiveness.
Here are the data files I used in the analysis:
File name: trump.zip and harris.zip (Original Data)
The files were downloaded from the Facebook Ad Library. The data focuses on two primary accounts, Trump and Harris, which had the highest number of advertisements and the largest ad spend. These accounts promoted two types of campaigns: presidential campaigns and victory funds. However, I will concentrate solely on the presidential campaigns.
Date Range: Based on my research, presidential campaigns typically begin about a year before the election. Therefore, I collected data starting from February 25, 2023, the date Harris announced her candidacy to compete with Trump, up to the current date, December 7, 2024.
File name: Trump-Harris add-id.csv (Processed Data)
This is the main dataset for "Election Facebook’s Ad Metrics 2024: Trump vs. Harris".
File name: AD-Tech-Analytic-Project-DashBoard.pbix
Power BI dashboard that imports data from Trump-Harris add-id.csv (Processed Data) and some other files.
File name: 6state trump data.csv, datamichigan.csv, data nevada.csv
Subsets filtered from Trump-Harris add-id.csv (Processed Data) that are used in AD-Tech-Analytic-Project-DashBoard.pbix.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Details
Touch-Slide is a dataset inspired by YCB-Slide. Its purpose is to increase the amount of data from multiple DIGIT sensors for self-supervised learning (SSL) pre-training of the Sparsh model. Touch-Slide consists of human-sliding interactions on toy kitchen objects using the DIGIT sensor. We used 9 objects, as shown below, and collected 5 trajectories for each, resulting in a total of 180k frames.
This is a visual example of how the dataset was collected, showcasing… See the full description on the dataset page: https://huggingface.co/datasets/facebook/touch-slide.
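The stated figures imply a rough per-trajectory size, worked out in the sketch below. It assumes that "180k" means about 180,000 frames and that frames are spread evenly across trajectories. Neither point is confirmed by the description, so the result is an order-of-magnitude estimate only.

```python
# Figures stated in the description.
num_objects = 9
trajectories_per_object = 5
total_frames = 180_000  # "180k", treated here as an approximate value

# Total number of sliding trajectories in the dataset.
num_trajectories = num_objects * trajectories_per_object

# Average frames per trajectory, under the even-spread assumption.
avg_frames_per_trajectory = total_frames / num_trajectories

print(num_trajectories, avg_frames_per_trajectory)
```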
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was collected from social media such as Facebook and Telegram and then further processed. It comes in three versions: orginal_cleaned, which is neither stemmed nor stripped of stopwords; stopword_removed, in which stopwords are removed but the text is not stemmed; and stemed, which is stemmed and has stopwords removed. Stemming was done using HornMorpho, developed by Michael Gasser (available at https://github.com/hltdi/HornMorpho). All datasets are normalized and free from noise such as punctuation marks and emojis.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Corpus consisting of 10,000 Facebook posts manually annotated on sentiment (2,587 positive, 5,174 neutral, 1,991 negative and 248 bipolar posts). The archive contains data and statistics in an Excel file (FBData.xlsx) and gold data in two text files with posts (gold-posts.txt) and labels (gols-labels.txt) on corresponding lines.
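Since posts and labels sit on corresponding lines of two text files, they can be paired by line position. The sketch below shows one way to do that on in-memory stand-ins for the two files. The sample posts and the label symbols (`p`, `n`, `0`) are invented for illustration, because the description does not state how labels are encoded.

```python
import io
from collections import Counter


def read_gold(posts_handle, labels_handle):
    """Pair each post with the label on the same line number."""
    posts = [line.rstrip("\n") for line in posts_handle]
    labels = [line.strip() for line in labels_handle]
    if len(posts) != len(labels):
        raise ValueError(
            f"line count mismatch: {len(posts)} posts, {len(labels)} labels"
        )
    return list(zip(posts, labels))


# Hypothetical stand-ins for the posts file and the labels file.
posts_file = io.StringIO("first post\nsecond post\nthird post\n")
labels_file = io.StringIO("p\n0\nn\n")

pairs = read_gold(posts_file, labels_file)
label_counts = Counter(label for _, label in pairs)

# The class sizes reported in the description should add up to 10,000.
reported = {"positive": 2587, "neutral": 5174, "negative": 1991, "bipolar": 248}
reported_total = sum(reported.values())
```

The length check matters here: if a post contained an embedded line break, the two files would drift out of alignment and every later label would be attached to the wrong post.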