100+ datasets found

Facebook Datasets
brightdata.com
.json, .csv, .xlsx
Updated Jan 27, 2023
Cite
Bright Data (2023). Facebook Datasets [Dataset]. https://brightdata.com/products/datasets/facebook
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Jan 27, 2023
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Access our extensive Facebook datasets that provide detailed information on public posts, pages, and user engagement. Gain insights into post performance, audience interactions, page details, and content trends with our ethically sourced data. Free samples are available for evaluation.

Over 940M records available

Price starts at $250/100K records

Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet.

100% ethical and compliant data collection

Included datapoints:

Post ID Post Content & URL Date Posted Hashtags Number of Comments Number of Shares Likes & Reaction Counts (by type) Video View Count Page Name & Category Page Followers & Likes Page Verification Status Page Website & Contact Info Is Sponsored Post Attachments (Images/Videos) External Link Data And much more
Facebook Names Dataset
academictorrents.com
bittorrent
Updated Nov 11, 2015
Cite
Ron Bowes (Skull Security) (2015). Facebook Names Dataset [Dataset]. https://academictorrents.com/details/e54c73099d291605e7579b90838c2cd86a8e9575
Explore at:
Available download formats: bittorrent (2991052604)
Dataset updated
Nov 11, 2015
Dataset authored and provided by
Ron Bowes (Skull Security)
License
https://academictorrents.com/nolicensespecified
Description
171 million names (100 million unique)

This torrent contains:

The URL of every searchable Facebook user's profile

The name of every searchable Facebook user, both unique and by count (perfect for post-processing, datamining, etc.)

Processed lists, including first names with count, last names with count, potential usernames with count, etc.

The programs I used to generate everything

So, there you have it: lots of awesome data from Facebook. Now, I just have to find one more problem with Facebook so I can write "Revenge of the Facebook Snatchers" and complete the trilogy. Any suggestions? >:-)

Limitations

So far, I have only indexed the searchable users, not their friends. Getting their friends will be significantly more data to process, and I don't have those capabilities right now. I'd like to tackle that in the future, though, so if anybody has any bandwidth they'd like to donate, all I need is an ssh account and Nmap installed. An additional limitation is that these are on
Facebook user data requests from federal agencies & governments H2 2024, by...
statista.com
tokrwards.com
+4 more
Updated Aug 4, 2025
Cite
Statista (2025). Facebook user data requests from federal agencies & governments H2 2024, by country [Dataset]. https://www.statista.com/statistics/287845/global-data-requests-from-facebook-by-federal-agencies-and-governments/
Explore at:
Dataset updated
Aug 4, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In the first half of 2024, Facebook received over 99,000 user data requests from law enforcement agencies of India. The United States ranked second, with over 81,000 user data requests, followed by Brazil, with nearly 26,000 requests. During the measured period, a total of 324,000 requests were submitted to the social network.
U.S. Facebook data requests from government agencies 2013-2024
statista.com
tokrwards.com
Updated Aug 4, 2025
Cite
Statista (2025). U.S. Facebook data requests from government agencies 2013-2024 [Dataset]. https://www.statista.com/statistics/879006/us-data-requests-facebook-federal-agencies-and-governments/
Explore at:
Dataset updated
Aug 4, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
Facebook received ****** user data requests from federal agencies and courts in the United States during the second half of 2024. The social network produced some user data in ** percent of requests from U.S. federal authorities. The United States accounts for the largest share of Facebook user data requests worldwide.
Facebook Revenue and Usage Statistics (2025)
businessofapps.com
Updated Aug 8, 2017
Cite
Business of Apps (2017). Facebook Revenue and Usage Statistics (2025) [Dataset]. https://www.businessofapps.com/data/facebook-statistics/
Explore at:
Dataset updated
Aug 8, 2017
Dataset authored and provided by
Business of Apps
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Facebook probably needs no introduction; nonetheless, here is a quick history of the company. The world’s biggest and most-famous social network was launched by Mark Zuckerberg while he was a...
Data from: Facebook Posts Datasets
brightdata.com
.json, .csv, .xlsx
Cite
Bright Data, Facebook Posts Datasets [Dataset]. https://brightdata.com/products/datasets/facebook/posts
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Use our Facebook Posts dataset to access detailed information about individual Facebook posts, including content, hashtags, engagement metrics, and page details like name, category, and followers. Popular use cases include analyzing user engagement, tracking content trends, and studying page dynamics for strategic insights.

Over 31M records available

Price starts at $250/100K records

Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet.

100% ethical and compliant data collection

Included datapoints:

Post ID Post Content & URL Date Posted Hashtags Number of Comments Number of Shares Likes & Reaction Counts (by type) Video View Count Page Name & Category Page Followers & Likes Page Verification Status Page Website & Contact Info Is Sponsored Post Attachments (Images/Videos) External Link Data And much more
Facebook Dataset
universe.roboflow.com
zip
Updated Jul 25, 2025
Cite
Samaritan (2025). Facebook Dataset [Dataset]. https://universe.roboflow.com/samaritan/facebook-6nrcn
Explore at:
Available download formats: zip
Dataset updated
Jul 25, 2025
Dataset authored and provided by
Samaritan
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Variables measured
Face T0jN Polygons
Description
Facebook

## Overview

Facebook is a dataset for instance segmentation tasks - it contains Face T0jN annotations for 1,943 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [MIT license](https://creativecommons.org/licenses/MIT).
Data from: Facebook Data for Sentiment Analysis
live.european-language-grid.eu
binary format
Updated Jul 16, 2013
+ more versions
Cite
(2013). Facebook Data for Sentiment Analysis [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/1057
Explore at:
Available download formats: binary format
Dataset updated
Jul 16, 2013
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0), https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
Corpus consisting of 10,000 Facebook posts manually annotated on sentiment (2,587 positive, 5,174 neutral, 1,991 negative and 248 bipolar posts). The archive contains data and statistics in an Excel file (FBData.xlsx) and gold data in two text files with posts (gold-posts.txt) and labels (gols-labels.txt) on corresponding lines.
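Because the gold posts and labels sit on corresponding lines, the two files can be paired line by line. The following is a minimal Python sketch under stated assumptions: one post per line, one label per line, and UTF-8 text. The sample strings and label spellings are invented for illustration and are not taken from the corpus.

```python
import io
from collections import Counter


def pair_posts_with_labels(posts_file, labels_file):
    """Pair each post with the label found on the same line number."""
    posts = [line.rstrip("\n") for line in posts_file]
    labels = [line.strip() for line in labels_file]
    if len(posts) != len(labels):
        raise ValueError("posts and labels must have the same number of lines")
    return list(zip(posts, labels))


# Hypothetical stand-ins for the two gold files (invented content).
sample_posts = io.StringIO("great day\nbus is late\nmeeting at noon\n")
sample_labels = io.StringIO("positive\nnegative\nneutral\n")

pairs = pair_posts_with_labels(sample_posts, sample_labels)
label_counts = Counter(label for _, label in pairs)
```

With the real archive, the two `io.StringIO` objects would be replaced by file handles opened on the posts and labels files named in the description; the length check guards against a misaligned pair of files.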
Cheltenham's Facebook Groups
kaggle.com
zip
Updated Apr 2, 2018
Cite
Mike Chirico (2018). Cheltenham's Facebook Groups [Dataset]. https://www.kaggle.com/datasets/mchirico/cheltenham-s-facebook-group
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Apr 2, 2018
Authors
Mike Chirico
License
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Description
Facebook is becoming an essential tool for more than just family and friends. Discover how Cheltenham Township (USA), a diverse community just outside of Philadelphia, deals with major issues such as the Bill Cosby trial, everyday traffic issues, sewer I/I problems and lost cats and dogs. And yes, theft.

Communities work when they're connected and exchanging information. What and who are the essential forces making a positive impact, and when and how do conversational threads get directed or misdirected?

Use Any Facebook Public Group

You can leverage the examples here for any public Facebook group. For an example of the source code used to collect this data, and a quick start docker image, take a look at the following project: facebook-group-scrape.

Data Sources

There are 4 csv files in the dataset, with data from the following 5 public Facebook groups:

Unofficial Cheltenham Township

Elkins Park Happenings!

Free Speech Zone

Cheltenham Lateral Solutions

Cheltenham Township Residents

post.csv

These are the main posts you will see on the page. It might help to take a quick look at the page. Commas in the msg field have been replaced with {COMMA}, and apostrophes have been replaced with {APOST}.

gid Group id (5 different Facebook groups)

pid Main Post id

id Id of the user posting

name User's name

timeStamp

shares

url

msg Text of the message posted.

likes Number of likes
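Since commas and apostrophes in msg are stored as {COMMA} and {APOST}, the original text has to be restored after parsing. A minimal Python sketch of that step follows; the column order, the timestamp format, and the sample row are assumptions made for illustration, and only the column names come from the list above.

```python
import csv
import io


def restore_msg(text):
    """Undo the {COMMA} and {APOST} substitutions described for the msg field."""
    return text.replace("{COMMA}", ",").replace("{APOST}", "'")


# Invented sample row; the real post.csv may order or format columns differently.
sample_csv = io.StringIO(
    "gid,pid,id,name,timeStamp,shares,url,msg,likes\n"
    "1,10,100,Pat Example,2017-01-01 12:00:00,0,http://example.invalid/p/10,"
    "Lost cat{COMMA} last seen near Pat{APOST}s house,3\n"
)

posts = []
for row in csv.DictReader(sample_csv):
    row["msg"] = restore_msg(row["msg"])  # restore only after the CSV is parsed
    posts.append(row)
```

Restoring the placeholders after parsing, rather than before, keeps the restored commas from being read as field separators.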

comment.csv

These are comments to the main post. Note, Facebook postings have comments, and comments on comments.

gid Group id

pid Matches Main Post identifier in post.csv

cid Comment Id.

timeStamp

id Id of user commenting

name Name of user commenting

rid Id of user responding to first comment

msg Message

like.csv

These are likes and responses. The two keys in this file (pid,cid) will join to post and comment respectively.

gid Group id

pid Matches Main Post identifier in post.csv

cid Matches Comments id.

response Response such as LIKE, ANGRY etc.

id The id of user responding

name Name of the user responding
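The join described above (pid to post.csv, cid to comment.csv) can be sketched with plain dictionaries. This is an illustration under assumptions, not the author's code: the rows are invented, and the description does not say how a response to a post (rather than to a comment) is encoded, so an empty cid is treated here as "response to the post itself".

```python
from collections import Counter

# Invented rows shaped like the three files; only the join keys matter here.
posts = [{"gid": "1", "pid": "10", "msg": "Road closed on Main St"}]
comments = [{"gid": "1", "pid": "10", "cid": "500", "msg": "Thanks for the heads-up"}]
likes = [
    {"gid": "1", "pid": "10", "cid": "", "response": "LIKE", "id": "7", "name": "A"},
    {"gid": "1", "pid": "10", "cid": "500", "response": "LIKE", "id": "8", "name": "B"},
    {"gid": "1", "pid": "10", "cid": "", "response": "ANGRY", "id": "9", "name": "C"},
]

post_by_pid = {p["pid"]: p for p in posts}
comment_by_cid = {c["cid"]: c for c in comments}

post_reactions = Counter()     # (pid, response) -> count
comment_reactions = Counter()  # (cid, response) -> count

for like in likes:
    # Assumption: an empty cid means the response targets the post itself.
    if like["cid"] and like["cid"] in comment_by_cid:
        comment_reactions[(like["cid"], like["response"])] += 1
    elif like["pid"] in post_by_pid:
        post_reactions[(like["pid"], like["response"])] += 1
```

If pid or cid values turn out not to be unique across groups, the lookup keys would need to include gid as well.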

member.csv

These are all the members in the group. Some members never, or rarely, post or comment. You may find multiple entries in this table for the same person: a member's name stays the same, but each change of profile picture is captured as a separate row, because Facebook gives users a new id in this table when they change their profile picture.

gid Group id

id Id of the member

name Name of the member

url URL of the member
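Because one person can appear under several ids, counting members by id overstates the group size. The sketch below groups rows by (gid, name) instead. That key is an assumption chosen for illustration: it relies on the note that names do not change, and it would wrongly merge two different people who share a name within a group. The rows are invented.

```python
from collections import defaultdict

# Invented rows shaped like member.csv.
members = [
    {"gid": "1", "id": "100", "name": "Pat Example", "url": "http://example.invalid/a"},
    {"gid": "1", "id": "101", "name": "Pat Example", "url": "http://example.invalid/b"},
    {"gid": "1", "id": "200", "name": "Sam Sample", "url": "http://example.invalid/c"},
]

ids_by_person = defaultdict(set)
for m in members:
    # Assumption: (gid, name) identifies one person within a group.
    ids_by_person[(m["gid"], m["name"])].add(m["id"])

unique_member_count = len(ids_by_person)
```

The resulting mapping from person to ids can also be inverted to normalize the id columns in the other three files before joining.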
Facebook metadata dataset LiLaH-HAG
live.european-language-grid.eu
repository.uantwerpen.be
binary format
Updated Aug 23, 2022
+ more versions
Cite
(2022). Facebook metadata dataset LiLaH-HAG [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/20476
Explore at:
Available download formats: binary format
Dataset updated
Aug 23, 2022
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
The LiLaH-HAG dataset (HAG is short for hate-age-gender) consists of metadata on Facebook comments to Facebook posts of mainstream media in Great Britain, Flanders, Slovenia and Croatia. The metadata available in the dataset are the hatefulness of the comment (0 is acceptable, 1 is hateful), age of the commenter (0-25, 26-30, 36-65, 65-), gender of the commenter (M or F), and the language in which the comment was written (EN, NL, SL, HR).

The hatefulness of the comment was assigned by multiple well-trained annotators by reading comments in the order of appearance in a discussion thread, while the age and gender variables were estimated from the Facebook profile of a specific user by a single annotator.
Facebook SNAP Network Data
academictorrents.com
bittorrent
Updated Nov 22, 2015
Cite
Stanford Network Analysis Platform (SNAP) (2015). Facebook SNAP Network Data [Dataset]. https://academictorrents.com/details/3efc53f35d49669b89039f2b4ec9de11ec1d73fd
Explore at:
Available download formats: bittorrent (951514)
Dataset updated
Nov 22, 2015
Dataset authored and provided by
Stanford Network Analysis Platform (SNAP)
License
https://academictorrents.com/nolicensespecified
Description
This dataset consists of circles (or friends lists) from Facebook. Facebook data was collected from survey participants using this Facebook app. The dataset includes node features (profiles), circles, and ego networks.
Facebook Usage: Who Uses Facebook?
searchlogistics.com
Updated Mar 17, 2025
Cite
(2025). Facebook Usage: Who Uses Facebook? [Dataset]. https://www.searchlogistics.com/learn/statistics/facebook-advertising-statistics/
Explore at:
Dataset updated
Mar 17, 2025
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
36.8% of the entire world’s population uses Facebook at least once per month.
winoground
huggingface.co
Updated Dec 12, 2023
Cite
AI at Meta (2023). winoground [Dataset]. https://huggingface.co/datasets/facebook/winoground
Explore at:
Dataset updated
Dec 12, 2023
Dataset authored and provided by
AI at Meta
Description
Dataset Card for Winoground

Dataset Description

Winoground is a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning. Given two images and two captions, the goal is to match them correctly—but crucially, both captions contain a completely identical set of words/morphemes, only in a different order. The dataset was carefully hand-curated by expert annotators and is labeled with a rich set of… See the full description on the dataset page: https://huggingface.co/datasets/facebook/winoground.
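The defining property of a caption pair, the same words in a different order, can be checked mechanically. The Python sketch below does this with whitespace tokenization, which is a simplifying assumption (the description speaks of words/morphemes). The captions are invented and are not taken from the dataset.

```python
from collections import Counter


def same_words_different_order(caption_a, caption_b):
    """True when both captions use exactly the same words but are not identical."""
    words_a = caption_a.lower().split()
    words_b = caption_b.lower().split()
    return Counter(words_a) == Counter(words_b) and words_a != words_b


# Invented caption pairs for illustration only.
pair_ok = same_words_different_order(
    "the dog chases the cat", "the cat chases the dog"
)
pair_bad = same_words_different_order(
    "the dog chases the cat", "the dog sees the cat"
)
```

Comparing word counts rather than word sets keeps repeated words (such as "the" above) from masking a difference between the captions.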
Facebook image data
dataverse.harvard.edu
search.dataone.org
Updated Mar 4, 2022
Cite
Yunkang Yang; Matthew Hindman; Trevor Davis (2022). Facebook image data [Dataset]. http://doi.org/10.7910/DVN/RNITKF
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/RNITKF
Dataset updated
Mar 4, 2022
Dataset provided by
Harvard Dataverse
Authors
Yunkang Yang; Matthew Hindman; Trevor Davis
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset contains a random sample of 1000 Facebook image posts from a collection of Facebook public pages and groups in August, September, and October 2020.
Data for: How much research shared on Facebook is hidden from public view?
search.dataone.org
dataverse.harvard.edu
Updated Nov 22, 2023
Cite
Enkhbayar, Asura; Haustein, Stefanie; Alperin, Juan Pablo (2023). Data for: How much research shared on Facebook is hidden from public view? [Dataset]. http://doi.org/10.7910/DVN/3CS5ES
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/3CS5ES
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Enkhbayar, Asura; Haustein, Stefanie; Alperin, Juan Pablo
Time period covered
Jan 1, 2015 - Jan 1, 2017
Description
All data required to reproduce results of "How much research shared on Facebook is hidden from public view?". More information about the manuscript, code, and reproducibility can be found here.

This dataset contains five spreadsheets from two different sources:

1. Data collected with our own method described in Enkhbayar and Alperin (2018). More details and instructions can be found in this GitHub repository.

plos_one_articles.csv: All articles published in PLOS ONE from 2015 - 2017

altmetric_counts.csv: POS and TW counts retrieved from Altmetric™

graph_api_counts.csv: AES counts collected with our methods using Facebook's Graph API

query_details.csv: Responses from Graph API

2. Data provided by Piwowar et al. (2017)

PLOS_2015-2017_idArt-DOI-PY-Journal-Title-LargerDiscipline-Discipline-Specialty.csv: Disciplinary categorisations for PLOS ONE publications as described in Piwowar et al. (2015)

References

Enkhbayar, A., & Alperin, J. P. (2018). Challenges of capturing engagement on Facebook for Altmetrics. STI 2018 Conference Proceedings, 1460–1469. Retrieved from http://arxiv.org/abs/1809.01194

Piwowar, H., Priem, J., Larivière, V., Alperin, J. P., Matthias, L., Norlander, B., … Haustein, S. (2018). The state of OA: A large-scale analysis of the prevalence and impact of Open Access articles. PeerJ, 6, e4375. doi: 10/ckh5
PE-Video
huggingface.co
Updated Jun 1, 2025
Cite
AI at Meta (2025). PE-Video [Dataset]. https://huggingface.co/datasets/facebook/PE-Video
Explore at:
Dataset updated
Jun 1, 2025
Dataset authored and provided by
AI at Meta
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
PE Video Dataset (PVD)

[📃 Tech Report] [📂 Github] The PE Video Dataset (PVD) is a large-scale collection of 1 million diverse videos, featuring 120,000+ expertly annotated clips. The dataset was introduced in our paper "Perception Encoder".

Overview

PE Video Dataset (PVD) comprises 1M high-quality and diverse videos. Among them, 120K videos are accompanied by automated and human-verified annotations, and all videos are accompanied by video descriptions and keywords.… See the full description on the dataset page: https://huggingface.co/datasets/facebook/PE-Video.
Facebook Statistics 2025: Users, Revenue, and Engagement Trends Explained
sqmagazine.co.uk
Updated Oct 2, 2025
Cite
SQ Magazine (2025). Facebook Statistics 2025: Users, Revenue, and Engagement Trends Explained [Dataset]. https://sqmagazine.co.uk/facebook-statistics/
Explore at:
Dataset updated
Oct 2, 2025
Dataset authored and provided by
SQ Magazine
License
https://sqmagazine.co.uk/privacy-policy/
Time period covered
Jan 1, 2024 - Dec 31, 2025
Area covered
Global
Description
In a small café in Austin, Texas, a 68-year-old grandmother shares reels of her garden with her granddaughter, who lives in Tokyo. Meanwhile, a high school student in Nairobi livestreams his gaming tutorial to friends across the world. Behind these everyday moments is Facebook, the digital backbone connecting over 3...
FACTORY
huggingface.co
Updated Jul 31, 2025
Cite
AI at Meta (2025). FACTORY [Dataset]. https://huggingface.co/datasets/facebook/FACTORY
Explore at:
Dataset updated
Jul 31, 2025
Dataset authored and provided by
AI at Meta
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Overview

FACTORY is a large-scale, human-verified, and challenging prompt set. We employ a model-in-the-loop approach to ensure quality and address the complexities of evaluating long-form generation. Starting with seed topics from Wikipedia, we expand each topic into a diverse set of prompts using large language models (LLMs). We then apply the model-in-the-loop method to filter out simpler prompts, maintaining a high level of difficulty. Human annotators further refine the prompts… See the full description on the dataset page: https://huggingface.co/datasets/facebook/FACTORY.
community-alignment-dataset
huggingface.co
Updated Jul 16, 2025
Cite
AI at Meta (2025). community-alignment-dataset [Dataset]. https://huggingface.co/datasets/facebook/community-alignment-dataset
Explore at:
Dataset updated
Jul 16, 2025
Dataset authored and provided by
AI at Meta
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Community Alignment

Github |
Paper

Dataset

Community Alignment is a large-scale open source, multilingual and multi-turn preference dataset to align LLMs with human preferences across cultures. It features prompt-level overlap in annotators, enabling social-choice-based and distributional approaches to LLM alignment, as well as natural language explanations for choices.

[Large-scale] ~200,000 comparisons of LLM responses, collected from >3,000 unique annotators who… See the full description on the dataset page: https://huggingface.co/datasets/facebook/community-alignment-dataset.
Facebook Ads Pixelpro9 Sponsoredtexts Dataset
universe.roboflow.com
zip
Updated Apr 3, 2025
Cite
AbdulsPersonalRoboflowWorkspace (2025). Facebook Ads Pixelpro9 Sponsoredtexts Dataset [Dataset]. https://universe.roboflow.com/abdulspersonalroboflowworkspace/facebook-ads-pixelpro9-sponsoredtexts/dataset/1
Explore at:
Available download formats: zip
Dataset updated
Apr 3, 2025
Dataset authored and provided by
AbdulsPersonalRoboflowWorkspace
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Words LVq8 MDMT Bounding Boxes
Description
Facebook Ads PixelPro9 SponsoredTexts

## Overview

Facebook Ads PixelPro9 SponsoredTexts is a dataset for object detection tasks - it contains Words LVq8 MDMT annotations for 1,663 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
