Facebook
Twitterhttps://brightdata.com/licensehttps://brightdata.com/license
Access our extensive Facebook datasets that provide detailed information on public posts, pages, and user engagement. Gain insights into post performance, audience interactions, page details, and content trends with our ethically sourced data. Free samples are available for evaluation. Over 940M records available Price starts at $250/100K records Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet. 100% ethical and compliant data collection Included datapoints:
Post ID Post Content & URL Date Posted Hashtags Number of Comments Number of Shares Likes & Reaction Counts (by type) Video View Count Page Name & Category Page Followers & Likes Page Verification Status Page Website & Contact Info Is Sponsored Post Attachments (Images/Videos) External Link Data And much more
Facebook
Twitterhttps://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified
171 million names (100 million unique) This torrent contains: The URL of every searchable Facebook user s profile The name of every searchable Facebook user, both unique and by count (perfect for post-processing, datamining, etc) Processed lists, including first names with count, last names with count, potential usernames with count, etc The programs I used to generate everything So, there you have it: lots of awesome data from Facebook. Now, I just have to find one more problem with Facebook so I can write "Revenge of the Facebook Snatchers" and complete the trilogy. Any suggestions? >:-) Limitations So far, I have only indexed the searchable users, not their friends. Getting their friends will be significantly more data to process, and I don t have those capabilities right now. I d like to tackle that in the future, though, so if anybody has any bandwidth they d like to donate, all I need is an ssh account and Nmap installed. An additional limitation is that these are on
Facebook
TwitterIn the first half of 2024, Facebook received over 99,000 user data requests from law enforcement agencies of India. The United States ranked second, with over 81,000 user data requests, followed by Brazil, with nearly 26,000 requests. During the measured period, a total of 324,000 requests were submitted to the social network.
Facebook
TwitterFacebook received ****** user data requests from federal agencies and courts in the United States during the second half of 2024. The social network produced some user data in ** percent of requests from U.S. federal authorities. The United States accounts for the largest share of Facebook user data requests worldwide.
Facebook
TwitterAttribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Facebook probably needs no introduction; nonetheless, here is a quick history of the company. The world’s biggest and most-famous social network was launched by Mark Zuckerberg while he was a...
Facebook
Twitterhttps://brightdata.com/licensehttps://brightdata.com/license
Use our Facebook Posts dataset to access detailed information about individual Facebook posts, including content, hashtags, engagement metrics, and page details like name, category, and followers. Popular use cases include analyzing user engagement, tracking content trends, and studying page dynamics for strategic insights. Over 31M records available Price starts at $250/100K records Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet. 100% ethical and compliant data collection Included datapoints:
Post ID Post Content & URL Date Posted Hashtags Number of Comments Number of Shares Likes & Reaction Counts (by type) Video View Count Page Name & Category Page Followers & Likes Page Verification Status Page Website & Contact Info Is Sponsored Post Attachments (Images/Videos) External Link Data And much more
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
## Overview
Facebook is a dataset for instance segmentation tasks - it contains Face T0jN annotations for 1,943 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [MIT license](https://creativecommons.org/licenses/MIT).
Facebook
TwitterAttribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Corpus consisting of 10,000 Facebook posts manually annotated on sentiment (2,587 positive, 5,174 neutral, 1,991 negative and 248 bipolar posts). The archive contains data and statistics in an Excel file (FBData.xlsx) and gold data in two text files with posts (gold-posts.txt) and labels (gols-labels.txt) on corresponding lines.
Facebook
Twitterhttp://www.gnu.org/licenses/old-licenses/gpl-2.0.en.htmlhttp://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Facebook is becoming an essential tool for more than just family and friends. Discover how Cheltenham Township (USA), a diverse community just outside of Philadelphia, deals with major issues such as the Bill Cosby trial, everyday traffic issues, sewer I/I problems and lost cats and dogs. And yes, theft.
Communities work when they're connected and exchanging information. What and who are the essential forces making a positive impact, and when and how do conversational threads get directed or misdirected?
Use Any Facebook Public Group
You can leverage the examples here for any public Facebook group. For an example of the source code used to collect this data, and a quick start docker image, take a look at the following project: facebook-group-scrape.
Data Sources
There are 4 csv files in the dataset, with data from the following 5 public Facebook groups:
post.csv
These are the main posts you will see on the page. It might help to take a quick look at the page. Commas in the msg field have been replaced with {COMMA}, and apostrophes have been replaced with {APOST}.
comment.csv
These are comments to the main post. Note, Facebook postings have comments, and comments on comments.
like.csv
These are likes and responses. The two keys in this file (pid,cid) will join to post and comment respectively.
member.csv
These are all the members in the group. Some members never, or rarely, post or comment. You may find multiple entries in this table for the same person. The name of the individual never changes, but they change their profile picture. Each profile picture change is captured in this table. Facebook gives users a new id in this table when they change their profile picture.
Facebook
TwitterAttribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The LiLaH-HAG dataset (HAG is short for hate-age-gender) consists of metadata on Facebook comments to Facebook posts of mainstream media in Great Britain, Flanders, Slovenia and Croatia. The metadata available in the dataset are the hatefulness of the comment (0 is acceptable, 1 is hateful), age of the commenter (0-25, 26-30, 36-65, 65-), gender of the commenter (M or F), and the language in which the comment was written (EN, NL, SL, HR).
The hatefulness of the comment was assigned by multiple well-trained annotators by reading comments in the order of appearance in a discussion thread, while the age and gender variables were estimated from the Facebook profile of a specific user by a single annotator.
Facebook
Twitterhttps://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified
This dataset consists of circles (or friends lists ) from Facebook. Facebook data was collected from survey participants using this Facebook app. The dataset includes node features (profiles), circles, and ego networks.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
36.8% of the entire world’s population uses Facebook at least once per month.
Facebook
TwitterDataset Card for Winoground
Dataset Description
Winoground is a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning. Given two images and two captions, the goal is to match them correctly—but crucially, both captions contain a completely identical set of words/morphemes, only in a different order. The dataset was carefully hand-curated by expert annotators and is labeled with a rich set of… See the full description on the dataset page: https://huggingface.co/datasets/facebook/winoground.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains a random sample of 1000 Facebook image posts from a collection of Facebook public pages and groups in August, September, and October 2020.
Facebook
TwitterAll data required to reproduce results of "How much research shared on Facebook is hidden from public view?". More information about the manuscript, code, and reproducibility can be found here. This dataset contains five spreadsheets from two different sources: 1. Data collected with our own method described in Enkhbayar and Alperin (2018). More details and instructions can be found in this GitHub repository. plos_one_articles.csv: All articles published in PLOS ONE from 2015 - 2017 altmetric_counts.csv: POS and TW counts retrieved from Altmetric™ graph_api_counts.csv: AES counts collected with our methods using Facebook's Graph API query_details.csv: Responses from Graph API 2. Data provided by Piwowar et al. (2017) PLOS_2015-2017_idArt-DOI-PY-Journal-Title-LargerDiscipline-Discipline-Specialty.csv: Disciplinary categorisations for PLOS ONE publications as described in Piwowar et al. (2015) References Enkhbayar, A., & Alperin, J. P. (2018). Challenges of capturing engagement on Facebook for Altmetrics. STI 2018 Conference Proceedings, 1460–1469. Retrieved from http://arxiv.org/abs/1809.01194 Piwowar, H., Priem, J., Larivière, V., Alperin, J. P., Matthias, L., Norlander, B., … Haustein, S. (2018). The state of OA: A large-scale analysis of the prevalence and impact of Open Access articles. PeerJ, 6, e4375. doi: 10/ckh5
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
PE Video Dataset (PVD)
[📃 Tech Report] [📂 Github] The PE Video Dataset (PVD) is a large-scale collection of 1 million diverse videos, featuring 120,000+ expertly annotated clips. The dataset was introduced in our paper "Perception Encoder".
Overview
PE Video Dataset (PVD) comprises 1M high quality and diverse videos. Among them, 120K videos are accompanied by automated and human-verified annotations. and all videos are accompanied with video description and keywords.… See the full description on the dataset page: https://huggingface.co/datasets/facebook/PE-Video.
Facebook
Twitterhttps://sqmagazine.co.uk/privacy-policy/https://sqmagazine.co.uk/privacy-policy/
In a small café in Austin, Texas, a 68-year-old grandmother shares reels of her garden with her granddaughter, who lives in Tokyo. Meanwhile, a high school student in Nairobi livestreams his gaming tutorial to friends across the world. Behind these everyday moments is Facebook, the digital backbone connecting over 3...
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Overview
FACTORY is a large-scale, human-verified, and challenging prompt set. We employ a model-in-the-loop approach to ensure quality and address the complexities of evaluating long-form generation. Starting with seed topics from Wikipedia, we expand each topic into a diverse set of prompts using large language models (LLMs). We then apply the model-in-the-loop method to filter out simpler prompts, maintaining a high level of difficulty. Human annotators further refine the prompts… See the full description on the dataset page: https://huggingface.co/datasets/facebook/FACTORY.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Community Alignment
Github |
Paper
Dataset
Community Alignment is a large-scale open source, multilingual and multi-turn preference dataset to align LLMs with human preferences across cultures. It features prompt-level overlap in annotators, enabling social-choice-based and distributional approaches to LLM alignment, as well as natural language explanations for choices.
[Large-scale] ~200,000 comparisons of LLM responses, collected from >3,000 unique annotators who… See the full description on the dataset page: https://huggingface.co/datasets/facebook/community-alignment-dataset.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Facebook Ads PixelPro9 SponsoredTexts is a dataset for object detection tasks - it contains Words LVq8 MDMT annotations for 1,663 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
Facebook
Twitterhttps://brightdata.com/licensehttps://brightdata.com/license
Access our extensive Facebook datasets that provide detailed information on public posts, pages, and user engagement. Gain insights into post performance, audience interactions, page details, and content trends with our ethically sourced data. Free samples are available for evaluation. Over 940M records available Price starts at $250/100K records Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet. 100% ethical and compliant data collection Included datapoints:
Post ID Post Content & URL Date Posted Hashtags Number of Comments Number of Shares Likes & Reaction Counts (by type) Video View Count Page Name & Category Page Followers & Likes Page Verification Status Page Website & Contact Info Is Sponsored Post Attachments (Images/Videos) External Link Data And much more