https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains real and fake images of human faces.
Real and Fake Face Detection: Fake Face Photos by Photoshop Experts
Introduction
When using social networks, have you ever encountered a 'fake identity'? Anyone can create a fake profile image using image editing tools or deep-learning-based generators. If you are interested in making the World Wide Web a better place by recognizing such fake faces, you should check out this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Fake Face Vs Real Face is a dataset for object detection tasks - it contains Fake Face annotations for 494 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Deepfake is a technology that creates fake images and videos that cannot easily be distinguished from real ones. It has legitimate uses, such as in cinematography and in hiding the identity of a witness, but it can also be used to spread false news in political campaigns, leading to major social problems. Face masks have become a necessity in daily life since the outbreak of the COVID-19 virus in 2020. Because these masks hide people's faces, fake video clips can spread widely, which increases the need for deepfake detection under these circumstances. This is the first deepfake dataset that includes face masks. You can use it to predict whether a video of a person wearing a face mask is a deepfake or not.
kenjon/deep-fake-face-swap dataset hosted on Hugging Face and contributed by the HF Datasets community
Download the 130K Real vs Fake Face Dataset for AI deepfake detection, face recognition, and ML research. Train smarter AI models today!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Fake Face Detector 2.0 is a dataset for object detection tasks - it contains Authenticity annotations for 602 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
TrueFace is a first dataset of real and synthetic faces that have been processed by social media platforms. The synthetic faces were generated with the successful StyleGAN generative models, and the images were shared on Facebook, Twitter and Telegram.
Images have historically been a universal and cross-cultural communication medium, capable of reaching people of any social background, status or education. Unsurprisingly, though, their social impact has often been exploited for malicious purposes, such as spreading misinformation and manipulating public opinion. With today's technologies, the ability to generate highly realistic fakes is within everyone's reach. A major threat derives in particular from synthetically generated faces, which can deceive even the most experienced observer. To counter this fake news phenomenon, researchers have employed artificial intelligence to detect synthetic images by analysing the patterns and artifacts introduced by the generative models. However, most online images are shared repeatedly through social media platforms. These platforms process uploaded images with operations (such as compression) that progressively degrade those useful forensic traces, compromising the effectiveness of the developed detectors. To solve the synthetic-vs-real problem "in the wild", more realistic image databases, like TrueFace, are needed to train specialised detectors.
https://creativecommons.org/publicdomain/zero/1.0/
All the images of faces here are generated using https://thispersondoesnotexist.com/
Under US copyright law, these images are technically not subject to copyright protection. Only "original works of authorship" are considered. "To qualify as a work of 'authorship' a work must be created by a human being," according to a US Copyright Office report.
https://www.theregister.com/2022/08/14/ai_digital_artwork_copyright/
I manually tagged all images as best as I could and separated them between the two classes below
Some may pass as either female or male, but I will leave it to you to do the reviewing. I included toddlers and babies under Male/Female.
Each of the faces is totally fake, created using a type of algorithm called a generative adversarial network (GAN).
A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in June 2014. Two neural networks contest with each other in a game (in the form of a zero-sum game, where one agent's gain is another agent's loss).
Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proved useful for semi-supervised learning, fully supervised learning, and reinforcement learning.
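The zero-sum game described above is commonly written as the minimax objective from the original GAN paper. The symbols are not defined in this dataset description and are introduced here only for illustration: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of the training data, and $p_z$ the noise prior fed to the generator.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
$$

The discriminator tries to maximise $V$ by scoring real samples high and generated samples low, while the generator tries to minimise it by producing samples that the discriminator scores as real.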
The images were collected with a simple Jupyter notebook that repeatedly requested https://thispersondoesnotexist.com/ and saved each image locally.
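The original notebook is not included here, so the following is only a minimal sketch of what such a loop might look like. It assumes that a plain GET request to the site returns image bytes directly; the function names, the file naming scheme, the `User-Agent` header and the delay between requests are all hypothetical choices, not the author's.

```python
import time
import urllib.request
from pathlib import Path
from typing import Callable

SOURCE_URL = "https://thispersondoesnotexist.com/"  # site named in the description


def fetch_image(url: str = SOURCE_URL) -> bytes:
    """Fetch one image as raw bytes (assumption: the URL serves the image directly)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def save_faces(count: int, out_dir: str,
               fetch: Callable[[], bytes] = fetch_image,
               delay_s: float = 1.0) -> list[Path]:
    """Call `fetch` `count` times and write each result to out_dir/face_00000.jpg, ..."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index in range(count):
        path = target / f"face_{index:05d}.jpg"  # hypothetical naming scheme
        path.write_bytes(fetch())
        saved.append(path)
        if index < count - 1:
            time.sleep(delay_s)  # pause between requests to avoid hammering the site
    return saved
```

Passing the fetch function as a parameter keeps the loop testable without network access; calling `save_faces(10, "faces")` would use the real site.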
https://creativecommons.org/publicdomain/zero/1.0/
The motivation behind this dataset is to provide a challenging test set for the task of classifying fake and real human faces. Most of the datasets available on Kaggle are "uniform" and do not show much variance in face features, particularly for the "Fake" class. Moreover, the fake faces collected in this dataset were generated using StyleGAN2, which makes them harder to classify correctly, even for the human eye. The real human faces in this dataset were gathered to give a fair representation of the different features (age, sex, makeup, ethnicity, etc.) that may be encountered in a production setup.
The images in this dataset are in JPEG format and have a uniform size of 300x300. The "Fake" faces were collected from the website thispersondoesnotexist.com. The "Real" face images were collected through the API of the website Unsplash, and the faces were then cropped out using the OpenCV library.
Total number of images: 1288
Number of "Fake" faces: 700
Number of "Real" faces: 589
The data.csv file contains each image's Id and its corresponding label.
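One way to check the class balance from data.csv is to count the rows per label. This is a minimal sketch: the description does not give the column names, so the `label` column name used below is an assumption and may need to be adjusted to match the actual file header.

```python
import csv
from collections import Counter
from typing import TextIO


def count_labels(csv_file: TextIO, label_column: str = "label") -> Counter:
    """Count rows per label value in an open CSV file that has a header row.

    `label_column` is an assumed column name; change it to match data.csv.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Example usage (assumes data.csv is in the working directory):
#
#     with open("data.csv", newline="") as handle:
#         counts = count_labels(handle)
#     print(counts, "total:", sum(counts.values()))
```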
Can you achieve a high accuracy on this dataset?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Anti Spoofing Face Fake is a dataset for object detection tasks - it contains Face annotations for 3,158 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset has been curated for the purpose of identifying manipulated face-swapping videos. It includes a wide range of image frames extracted from both genuine and altered video content.
📌 Dataset Overview The dataset is organized into two categories:
real/ – Contains frames from authentic, unaltered videos
fake/ – Contains frames from videos that have undergone face-swapping manipulations
Each folder contains a large number of .png image files, with consistent formatting to ensure ease of use for machine learning and computer vision research.
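The real/ and fake/ layout described above maps naturally onto a list of labelled file paths. The sketch below shows one possible way to build that list with the standard library; the function name and the 0/1 label encoding are illustrative assumptions, not part of the dataset.

```python
from pathlib import Path


def list_labeled_frames(root: str) -> list[tuple[Path, int]]:
    """Return (path, label) pairs: 0 for frames in real/, 1 for frames in fake/."""
    labels = {"real": 0, "fake": 1}  # assumed encoding, chosen for illustration
    samples = []
    for folder, label in labels.items():
        # Only .png files are collected, matching the format stated above.
        for path in sorted((Path(root) / folder).glob("*.png")):
            samples.append((path, label))
    return samples
```

The resulting list can be shuffled and split into training and validation sets, or wrapped in whatever dataset class the chosen framework expects.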
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
2023 Fake or Real: AI-generated Image Discrimination Competition dataset is now available on Hugging Face!
Hello 🖐️ We are excited to announce the release of the dataset for the 2023 Fake or Real: AI-generated Image Discrimination Competition. The competition was held on AI CONNECT (https://aiconnect.kr/) from June 26th to July 6th, 2023, with 768 participants. If you're interested in evaluating the performance of your model on the test dataset, we encourage you to visit the… See the full description on the dataset page: https://huggingface.co/datasets/mncai/Fake_or_Real_Competition_Dataset.
https://github.com/EricGzq/Hybrid-Fake-Face-Dataset
We build a hybrid fake face (HFF) dataset, which contains eight types of face images. For real face images, three types are randomly selected from three open datasets: low-resolution face images from CelebA, high-resolution face images from CelebA-HQ, and face video frames from FaceForensics. In this way, real face images under internet scenarios are simulated as realistically as possible. Then, several of the most representative face manipulation techniques are selected to produce fake face images: PGGAN and StyleGAN for identity manipulation, Face2Face and Glow for facial expression manipulation, and StarGAN for face attribute transfer. The HFF dataset is a large fake face dataset containing more than 155k face images.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset was collected in 2023 and comprises electroencephalography, physiological and behavioural data acquired from 73 healthy individuals (ages: 21-45). The task was administered as part of a larger study.
The objective of the study was to investigate whether emotional arousal affects people's perceived realness of others' faces, given ambiguous information. To manipulate participants' emotional arousal, images of angry (high emotionality) and neutral (low emotionality) faces, selected based on their rated intensity from the NimStim Set of Facial Expressions (Tottenham et al., 2009), were used as subliminal primes, and facial images from the Multi-Racial Mega-Resolution database (Strohminger et al., 2016) were used as target stimuli. Blank screens were flashed prior to the target presentation in control trials. Forward and backward masks, generated by scrambling the primes, were implemented to prevent the primes from breaking awareness.
Each participant underwent a total of 222 trials, each comprising a forward mask, followed by the prime and a backward mask, before the presentation of the target stimulus. The primes and targets were presented in a randomized order, and trials were administered over the course of 3 blocks, between which participants were given a break to rest before proceeding to the next block of trials. During the presentation of the target stimulus, participants were instructed to indicate whether they thought the target was real or fake within a limited span of time (750 ms), after which they rated their confidence in their response using a sliding scale (0-100).
EEG signals were recorded using the EasyCap 64-channel and BrainVision Recording system. Electrodes were placed on the EEG cap according to the standard 10-5 system of electrode placement (Oostenveld & Praamstra, 2001), and impedance was kept below 12 kOhm for each subject. The ground electrode was placed on the forehead, and Cz was used as the reference channel. During recording, the sampling rate was 10000 Hz. Note that channels Tp9 and Tp10 were placed near the outer canthi of each eye, and POz as well as Oz were fixed above and below one of the eyes to measure the EOG.
Participants' physiological signals, that is their electrocardiogram (ECG), photoplethysmograph (PPG) and respiration signals (RSP), were obtained at a sampling frequency of 1000Hz. All physiological signals were recorded via the PLUX OpenSignals software and BITalino Toolkit.
ECG was collected using three ECG electrodes placed according to a modified Lead II configuration, and RSP was acquired using a respiration belt tightened over participants' upper abdomen. PPG sensors, which record changes in blood volume, were clipped on the tip of the index finger of participants' non-dominant hand to measure heart rate and oxygen saturation.
Live Face Anti-Spoof Dataset
A live face dataset is crucial for advancing computer vision tasks such as face detection, anti-spoofing detection, and face recognition. The Live Face Anti-Spoof Dataset offered by Ainnotate is specifically designed to train algorithms for anti-spoofing purposes, ensuring that AI systems can accurately differentiate between real and fake faces in various scenarios.
Key Features:
Comprehensive Video Collection: The dataset features thousands of videos showcasing a diverse range of individuals, including males and females, with and without glasses. It also includes men with beards, mustaches, and clean-shaven faces.
Lighting Conditions: Videos are captured in both indoor and outdoor environments, ensuring that the data covers a wide range of lighting conditions, making it highly applicable for real-world use.
Data Collection Method: Our datasets are gathered through a community-driven approach, leveraging our extensive network of over 700k users across various Telegram apps. This method ensures that the data is not only diverse but also ethically sourced with full consent from participants, providing reliable and real-world applicable data for training AI models.
Versatility: This dataset is ideal for training models in face detection, anti-spoofing, and face recognition tasks, offering robust support for these essential computer vision applications.
In addition to the Live Face Anti-Spoof Dataset, FileMarket provides specialized datasets across various categories to support a wide range of AI and machine learning projects:
Object Detection Data: Perfect for training AI in image and video analysis.
Machine Learning (ML) Data: Offers a broad spectrum of applications, from predictive analytics to natural language processing (NLP).
Large Language Model (LLM) Data: Designed to support text generation, chatbots, and machine translation models.
Deep Learning (DL) Data: Essential for developing complex neural networks and deep learning models.
Biometric Data: Includes diverse datasets for facial recognition, fingerprint analysis, and other biometric applications.
This live face dataset, alongside our other specialized data categories, empowers your AI projects by providing high-quality, diverse, and comprehensive datasets. Whether your focus is on anti-spoofing detection, face recognition, or other biometric and machine learning tasks, our data offerings are tailored to meet your specific needs.
This dataset includes all detectable faces of the corresponding part of the full dataset. Kaggle and the host expected and encouraged us to train our models outside of Kaggle’s notebooks environment; however, for someone who prefers to stick to Kaggle's kernels, this dataset would help a lot 😄.
It can be used for a variety of purposes, e.g. classification.
Want something to start with? Let's check this demo 😉.
https://creativecommons.org/publicdomain/zero/1.0/
A dataset of fake faces scraped from here. The current version is heavily biased towards female faces.
Contains 6.4K images of fake faces, in color and at 1024x1024 resolution. Although this version is heavily biased towards female faces, it includes a mix of other faces to help with GANs.
I'm grateful for the amazing work of the creators of StyleGAN2 and everyone associated with it!
I hope GAN would evolve into a state where we could generate anything within seconds based on any classification. Imagine an MMORPG built based entirely on millions of player choices.
gagan3012/fake-news dataset hosted on Hugging Face and contributed by the HF Datasets community
saakshigupta/gradcam-fake-real-faces-xception-test dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Similar face recognition has always been one of the most challenging research directions in face recognition. This project shares the similar face images (SFD.zip) that we have collected so far. All images are labeled and were collected from publicly available datasets such as LFW and CASIA-WebFace. We will continue to collect larger-scale data and to update this project. Because the dataset is too large, we uploaded it as a compressed zip file (SFD.zip). We have also uploaded a few examples for everyone to view. Email: ileven@shu.edu.cn