100+ datasets found
  1. Synthetic-Faces-High-Quality--SFHQ--part-2

    • kaggle.com
    Updated Sep 6, 2022
    Cite
    (2022). Synthetic-Faces-High-Quality--SFHQ--part-2 [Dataset]. https://www.kaggle.com/datasets/selfishgene/synthetic-faces-high-quality-sfhq-part-2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 6, 2022
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Synthetic Faces High Quality (SFHQ) part 2

    This dataset consists of 91,361 high-quality 1024x1024 curated face images. It was created by "bringing to life" various 3D models and by correcting flawed text-to-image generations from the Stable Diffusion model, using a process similar to the one described in this short Twitter thread: the images are encoded into the StyleGAN2 latent space, and a small manipulation turns each one into a photo-realistic image.

    The dataset also contains facial landmarks (an extended set) and face parsing semantic segmentation maps. An example script demonstrates how to access the landmarks and segmentation maps, how to textually search within the dataset (using CLIP image/text feature vectors), and performs some exploratory analysis of the dataset. A link to the dataset's GitHub repo is provided.
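    The textual CLIP search mentioned above can be sketched with the dataset's precomputed CLIP feature vectors. The sketch below is illustrative only: `rank_by_text_query` is a hypothetical helper, and random vectors stand in for real CLIP features (in practice the query vector would come from a CLIP text encoder).

```python
import numpy as np

def rank_by_text_query(image_feats: np.ndarray, text_feat: np.ndarray, top_k: int = 5):
    """Rank dataset images against a CLIP text-query vector by cosine similarity."""
    # Normalize rows so a plain dot product equals cosine similarity.
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feat / np.linalg.norm(text_feat)
    sims = img @ txt
    return np.argsort(-sims)[:top_k]

# Toy demo with random vectors standing in for real CLIP features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 512))
query = feats[42] + 0.01 * rng.normal(size=512)  # a query vector "close to" image 42
print(rank_by_text_query(feats, query, top_k=3)[0])  # → 42
```

    The same ranking works for image-to-image queries, since CLIP image and text features live in the same space.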

    The process that "brings to life" face-like images and creates several candidate photo-realistic ones is illustrated here: https://i.ibb.co/0sw8TkL/bring-to-life-process-SD-FS-2.png

    More Details

    1. The original inspiration images are taken from the Face Synthetics dataset, which contains images generated from 3D face models, and from the Stable Diffusion v1.4 model using various face portrait prompts that span a wide range of ethnicities, ages, expressions, hairstyles, etc. Note that Stable Diffusion faces often contain extreme errors (as can be seen in the three rightmost columns in the image above), so they cannot be used to create a photo-realistic dataset without a correcting model or an extremely lengthy manual curation process.
    2. Each inspiration image was encoded by encoder4editing (e4e) into the StyleGAN2 latent space (StyleGAN2 is a generative face model trained on the FFHQ dataset), and multiple candidate images were generated from each inspiration image.
    3. These candidate images were then further curated and verified as photo-realistic and high quality by a single human (me) and by a machine learning assistant model that was trained to approximate my own judgments and helped me scale myself to assess the quality of all images in the dataset.
    4. Near duplicates and images that were too similar were removed using CLIP features (no two images in the dataset have a CLIP similarity score greater than ~0.92).
    5. From each image, various pre-trained features were extracted and are provided here for convenience, in particular CLIP features for fast textual queries of the dataset.
    6. From each image, semantic segmentation maps were extracted using Face Parsing BiSeNet and are provided in the dataset under "segmentations".
    7. From each image, an extended landmark set was extracted that also contains the inner and outer hairlines (unique landmarks that are usually not extracted by other algorithms). These landmarks were extracted using Dlib, Face Alignment, and some post-processing of Face Parsing BiSeNet, and are provided in the dataset under "landmarks".
    8. NOTE: semantic segmentation maps and landmarks were first calculated on a scaled-down 256x256 version of the images and then upscaled to 1024x1024.
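    Step 4 above can be sketched as a greedy filter over CLIP features: keep an image only if its cosine similarity to every already-kept image stays at or below the ~0.92 threshold. This is a minimal illustration under that reading, not the author's actual code:

```python
import numpy as np

def remove_near_duplicates(feats: np.ndarray, threshold: float = 0.92):
    """Greedily keep images whose CLIP cosine similarity to all kept images is <= threshold."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    kept = []
    for i in range(len(normed)):
        # Compare against every image kept so far; drop near-duplicates.
        if all(normed[i] @ normed[j] <= threshold for j in kept):
            kept.append(i)
    return kept

# Toy demo: image 1 is a near-copy of image 0 and gets filtered out.
rng = np.random.default_rng(1)
feats = rng.normal(size=(5, 64))
feats[1] = feats[0] + 1e-3 * rng.normal(size=64)
print(remove_near_duplicates(feats))  # → [0, 2, 3, 4]
```

    The greedy pass is O(N²) in the worst case; for ~100k images a nearest-neighbor index would normally be used instead.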

    Parts 1,2,3,4

    • Part 1 of the dataset consists of 89,785 HQ 1024x1024 curated face images. It uses "inspiration" images from Artstation-Artistic-face-HQ dataset (AAHQ), Close-Up Humans dataset and UIBVFED dataset.
    • Part 2 of the dataset consists of 91,361 HQ 1024x1024 curated face images. It uses "inspiration" images from Face Synthetics dataset and by sampling from the Stable Diffusion v1.4 text to image generator using varied face portrait prompts.
    • Part 3 of the dataset consists of 118,358 HQ 1024x1024 curated face images. It uses "inspiration" images sampled from the StyleGAN2 mapping network with very high truncation psi coefficients to increase the diversity of the generation. Here, the e4e encoder is basically used as a new kind of truncation trick.
    • Part 4 of the dataset consists of 125,754 HQ 1024x1024 curated face images. It uses "inspiration" images by sampling from the Stable Diffusion v2.1 text to image generator using varied face portrait prompts.
    • See also dataset github repo with full details and links

    Summary

    Overall, the SFHQ dataset contains ~425,000 high quality and curated synthetic face images that have no privacy issues or license issues surrounding them.

    This dataset contains a high degree of variability along the axes of identity, ethnicity, age, pose, expression, lighting conditions, hair style, hair color, and facial hair. It lacks variability along accessory axes such as hats, earphones, and various jewelry. It also doesn't contain any occlusions, except the self-occlusion of hair occluding the forehead, the ears, and (rarely) the eyes. This dataset naturally inherits all the biases of its source datasets (FFHQ, AAHQ, Close-Up Humans, Face Synthetics, LAION-5B) and of the StyleGAN2 and Stable Diffusion models.

    The purpose of this dataset is to be of sufficiently high quality that new machine learning models can be trained using this data, including even generative face models such as StyleGAN. The dataset may be extended from time to time with additional supervision labels (e.g. text descriptions), but no promises.

    Hope this is helpful to some of you, feel free to use as you see fit...

  2. CMU Face Images

    • data.world
    zip
    Updated Jun 19, 2023
    + more versions
    Cite
    UCI (2023). CMU Face Images [Dataset]. https://data.world/uci/cmu-face-images
    Explore at:
    zip (available download format)
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    data.world, Inc.
    Authors
    UCI
    Description

    This data consists of 640 black and white face images of people taken with varying pose (straight, left, right, up), expression (neutral, happy, sad, angry), eyes (wearing sunglasses or not), and size.

    Source:

    Original Owner and Donor: Tom Mitchell, School of Computer Science, Carnegie Mellon University. tom.mitchell '@' cmu.edu, http://www.cs.cmu.edu/~tom/

    Data Set Information:

    Each image can be characterized by the pose, expression, eyes, and size. There are 32 images for each person, capturing every combination of features. To view the images, you can use the program xv. The image data can be found in /faces. This directory contains 20 subdirectories, one for each person, named by userid. Each of these directories contains several different face images of the same person. You will be interested in the images with the following naming convention: userid_pose_expression_eyes_scale.pgm.

    • userid is the user id of the person in the image, and this field has 20 values: an2i, at33, boland, bpm, ch4f, cheyer, choon, danieln, glickman, karyadi, kawamura, kk49, megak, mitchell, night, phoebe, saavik, steffi, sz24, and tammo.
    • pose is the head position of the person, and this field has 4 values: straight, left, right, up.
    • expression is the facial expression of the person, and this field has 4 values: neutral, happy, sad, angry.
    • eyes is the eye state of the person, and this field has 2 values: open, sunglasses.
    • scale is the scale of the image, and this field has 3 values: 1, 2, and 4. 1 indicates a full-resolution image (128 columns by 120 rows); 2 indicates a half-resolution image (64 by 60); 4 indicates a quarter-resolution image (32 by 30).

    If you've been looking closely in the image directories, you may notice that some images have a .bad suffix rather than the .pgm suffix. As it turns out, 16 of the 640 images taken have glitches due to problems with the camera setup; these are the .bad images. Some people had more glitches than others, but everyone who got "faced" should have at least 28 good face images (out of the 32 variations possible, discounting scale). More information and C code for loading the images is available here: .
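    The naming convention above can be unpacked mechanically. A small sketch (`parse_cmu_filename` is a hypothetical helper; it assumes full-resolution filenames may omit the scale field, which should be verified against the actual files):

```python
POSES = {"straight", "left", "right", "up"}
EXPRESSIONS = {"neutral", "happy", "sad", "angry"}
EYES = {"open", "sunglasses"}

def parse_cmu_filename(name: str) -> dict:
    """Split a filename like 'an2i_left_angry_open_2.pgm' into its fields."""
    stem = name.rsplit(".", 1)[0]  # drop the .pgm (or .bad) suffix
    parts = stem.split("_")
    userid, pose, expression, eyes = parts[:4]
    # Assumption: full-resolution images may carry no explicit scale field.
    scale = int(parts[4]) if len(parts) > 4 else 1
    assert pose in POSES and expression in EXPRESSIONS and eyes in EYES
    return {"userid": userid, "pose": pose, "expression": expression,
            "eyes": eyes, "scale": scale}

print(parse_cmu_filename("an2i_left_angry_open_2.pgm"))
```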

    Attribute Information:

    N/A

    Relevant Papers:

    T. Mitchell. Machine Learning, McGraw Hill, 1997.

    Papers That Cite This Data Set [1]:

    • Xiaofeng He and Partha Niyogi. Locality Preserving Projections. NIPS. 2003.
    • Marina Meila and Michael I. Jordan. Learning with Mixtures of Trees. Journal of Machine Learning Research, 1. 2000.

    Citation Request:

    You may use this material free of charge for any educational purpose, provided attribution is given in any lectures or publications that make use of this material. [1] Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info

    Source: http://archive.ics.uci.edu/ml/datasets/CMU+Face+Images

  3. Frontal Face Images - Test Set A

    • data.world
    zip
    Updated Jun 9, 2023
    + more versions
    Cite
    Machine Learning Research Data (2023). Frontal Face Images - Test Set A [Dataset]. https://data.world/ml-research/frontal-face-images-test-set-a
    Explore at:
    zip (available download format)
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    data.world, Inc.
    Authors
    Machine Learning Research Data
    Description

    The image dataset is used by the CMU Face Detection Project and is provided for evaluating algorithms for detecting frontal views of human faces. This particular test set was originally assembled as part of work in Neural Network Based Face Detection. It combines images collected at CMU and MIT.

    Please give appropriate acknowledgements when you use these test sets. In the lists of files below, you will find references to Test Sets A, B, C and the Rotated Test Set. Test Set B was provided by Kah-Kay Sung and Tomaso Poggio at the AI/CBCL Lab at MIT, and Test Sets A, C and the rotated test set were collected here at CMU (by Henry A. Rowley, Shumeet Baluja, and Takeo Kanade).

    In [Schneiderman and Kanade, 2000] and [Schneiderman and Kanade, 1998] we refer to the combination of test sets A, B, and C as the "combined test sets of Sung and Poggio and Rowley, Baluja, and Kanade." In [Rowley, Baluja, and Kanade, 1998] we refer to the combination of sets A, B, C as "test set one" and in [Rowley, Baluja, and Kanade, 1997] we refer to it as the "upright set" as distinguished from the "rotated set."

    Additional Information

    We provide ground truth for face locations in the following format, with one line per face (extreme side views are ignored):

    filename left-eye right-eye nose left-corner-mouth center-mouth right-corner-mouth

    For each feature on a face to be detected, two numbers are given. These numbers are the x and y coordinates (measured from the upper left corner) of the feature in the image.
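    A minimal sketch of a parser for this format (one filename token followed by an x/y pair for each of the six listed features; `parse_ground_truth_line` is a hypothetical helper):

```python
FEATURES = ["left-eye", "right-eye", "nose",
            "left-corner-mouth", "center-mouth", "right-corner-mouth"]

def parse_ground_truth_line(line: str):
    """Parse 'filename x1 y1 x2 y2 ...' into (filename, {feature: (x, y)})."""
    tokens = line.split()
    filename, coords = tokens[0], list(map(float, tokens[1:]))
    assert len(coords) == 2 * len(FEATURES), "expected six (x, y) pairs"
    points = {feat: (coords[2 * i], coords[2 * i + 1])
              for i, feat in enumerate(FEATURES)}
    return filename, points

fname, pts = parse_ground_truth_line(
    "img1.gif 10 20 30 20 20 35 12 50 20 52 28 50")
print(pts["nose"])  # → (20.0, 35.0)
```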

    Source: http://vasc.ri.cmu.edu/idb/images/face/frontal_images/images.html

  4. ORL Dataset

    • paperswithcode.com
    Updated Feb 18, 2021
    + more versions
    Cite
    Ferdinand Samaria; Andy Harter (2021). ORL Dataset [Dataset]. https://paperswithcode.com/dataset/orl
    Explore at:
    Dataset updated
    Feb 18, 2021
    Authors
    Ferdinand Samaria; Andy Harter
    Description

    The ORL Database of Faces contains 400 images from 40 distinct subjects. For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement). The size of each image is 92x112 pixels, with 256 grey levels per pixel.
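    The original AT&T distribution of this database ships the images as raw (P5) PGM files; that format is an assumption here, not stated in this listing. A minimal reader for a binary PGM with the stated 92x112 size and 256 grey levels:

```python
import numpy as np

def read_pgm_p5(data: bytes) -> np.ndarray:
    """Decode a binary (P5) PGM image into a (rows, cols) uint8 array."""
    # Header: magic, width, height, maxval (whitespace-separated), then raw pixels.
    fields = []
    idx = 0
    while len(fields) < 4:
        while data[idx:idx + 1].isspace():   # skip whitespace between tokens
            idx += 1
        if data[idx:idx + 1] == b"#":        # skip '#' comment lines
            idx = data.index(b"\n", idx) + 1
            continue
        end = idx
        while not data[end:end + 1].isspace():
            end += 1
        fields.append(data[idx:end])
        idx = end
    magic, width, height, maxval = fields[0], int(fields[1]), int(fields[2]), int(fields[3])
    assert magic == b"P5" and maxval == 255
    pixels = data[idx + 1: idx + 1 + width * height]  # one whitespace byte after maxval
    return np.frombuffer(pixels, dtype=np.uint8).reshape(height, width)

# Toy demo: a synthetic 92x112 image round-trips through the parser.
raw = bytes(range(256)) * (92 * 112 // 256 + 1)
img_bytes = b"P5\n92 112\n255\n" + raw[: 92 * 112]
print(read_pgm_p5(img_bytes).shape)  # → (112, 92)
```

    In practice a library such as Pillow handles PGM directly; the sketch just makes the header layout explicit.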

    Download dataset from Kaggle: https://www.kaggle.com/datasets/kasikrit/att-database-of-faces

  5. Flickr Faces HQ (FFHQ) 70K from StyleGAN

    • academictorrents.com
    bittorrent
    Updated Mar 8, 2021
    Cite
    Tero Karras and Samuli Laine and Timo Aila (2021). Flickr Faces HQ (FFHQ) 70K from StyleGAN [Dataset]. https://academictorrents.com/details/1c1e60f484e911b564de6b4d8b643e19154d5809
    Explore at:
    bittorrent (available download format)
    Dataset updated
    Mar 8, 2021
    Dataset authored and provided by
    Tero Karras and Samuli Laine and Timo Aila
    License

    https://academictorrents.com/nolicensespecified

    Description

    Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN). The dataset consists of 70,000 high-quality PNG images at 1024x1024 resolution and contains considerable variation in terms of age, ethnicity and image background. It also has good coverage of accessories such as eyeglasses, sunglasses, hats, etc. The images were crawled from Flickr, thus inheriting all the biases of that website, and automatically aligned and cropped using dlib. Only images under permissive licenses were collected. Various automatic filters were used to prune the set, and finally Amazon Mechanical Turk was used to remove the occasional statues, paintings, or photos of photos.

  6. SJB Face Dataset: Indian Face Image Dataset with changes in Pose,...

    • ieee-dataport.org
    Updated Nov 16, 2022
    Cite
    Bhaskar Belavadi (2022). SJB Face Dataset: Indian Face Image Dataset with changes in Pose, Illumination,Expression and Occlusion [Dataset]. http://doi.org/10.21227/xm8n-7r78
    Explore at:
    Dataset updated
    Nov 16, 2022
    Dataset provided by
    Institute of Electrical and Electronics Engineers (http://www.ieee.ro/)
    Authors
    Bhaskar Belavadi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Biometric management that uses the face is a very challenging task and requires a dedicated dataset that captures variations in pose, emotion, and even occlusion. The current work aims to deliver a dataset for training and testing purposes. The SJB Face dataset is one such Indian face image dataset, which can be used for face recognition. It contains face images collected with a digital camera under varying conditions: different poses, different expressions, partially occluded faces, and a uniform attire. The dataset was collected from 48 students, with 13 face images per student, all in white attire. This database can be used for face recognition projects in academia and industry, for example to develop attendance systems, since such systems require systematic images for training.

  7. Real and Fake Face Detection

    • kaggle.com
    zip
    Updated Jan 14, 2019
    Cite
    CIPLAB @ Yonsei University (2019). Real and Fake Face Detection [Dataset]. https://www.kaggle.com/datasets/ciplab/real-and-fake-face-detection
    Explore at:
    zip (452107760 bytes; available download format)
    Dataset updated
    Jan 14, 2019
    Authors
    CIPLAB @ Yonsei University
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Real and Fake Face Detection

    Computational Intelligence and Photography Lab
    Department of Computer Science, Yonsei University

    Data samples: https://github.com/minostauros/Real-and-Fake-Face-Detection/raw/master/samples.jpg

    Fake Face Photos by Photoshop Experts

    Introduction

    When using social networks, have you ever encountered a 'fake identity'? Anyone can create a fake profile image using image editing tools, or even using deep learning based generators. If you are interested in making the world wide web a better place by recognizing such fake faces, you should check this dataset.

    What's Inside and Why

    Our dataset contains expert-generated, high-quality photoshopped face images. The images are composites of different faces, spliced by eyes, nose, mouth, or the whole face. You may wonder why we need these expensive images rather than images automatically generated by computers. Say we want to train a classifier for real and fake face images. With generative models such as Generative Adversarial Networks (GANs), it is very easy to generate fake face images, and a classifier trained on them does a great job discriminating real from generated faces. We can reasonably assume that such a classifier learns some kind of pattern in GAN-generated images. However, those patterns can be futile against human experts, since exquisite counterfeits by experts are created by a completely different process. Thus we had to create our own dataset with expert-level fake face photos.

    Directory and File Information

    Inside the parent directory, training_real/training_fake contain real/fake face photos, respectively. The fake photos fall into three groups: easy, mid, and hard (these groups were separated subjectively, so we do not recommend using them as explicit categories). You can also use the filenames of fake images to see which parts of the face were replaced (refer to the image below). Filename description: https://github.com/minostauros/Real-and-Fake-Face-Detection/raw/master/filename_description.jpg
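    A parsing sketch for the fake-image filenames. The exact convention is documented only in the linked image; the code below assumes filenames like `hard_93_1110.jpg`, where the four binary digits flag the replaced regions in the order left eye, right eye, nose, mouth. Verify both the field order and the digit meaning against the filename-description image before relying on this.

```python
REGIONS = ["left_eye", "right_eye", "nose", "mouth"]

def parse_fake_filename(name: str):
    """Parse e.g. 'hard_93_1110.jpg' into (difficulty, replaced regions).

    Assumption: the four binary digits flag the replaced regions in the
    order left eye, right eye, nose, mouth (check the filename-description
    image linked above to confirm).
    """
    stem = name.rsplit(".", 1)[0]
    difficulty, _photo_id, bits = stem.split("_")
    replaced = [region for region, bit in zip(REGIONS, bits) if bit == "1"]
    return difficulty, replaced

print(parse_fake_filename("hard_93_1110.jpg"))  # → ('hard', ['left_eye', 'right_eye', 'nose'])
```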

    Citation

    You can cite our dataset as follows; [Date Retrieved] should be replaced with your own retrieval date. Seonghyeon Nam, Seoung Wug Oh, Jae Yeon Kang, Chang Ha Shin, Younghyun Jo, Young Hwi Kim, Kyungmin Kim, Minho Shim, Sungho Lee, Yunji Kim, Suho Han, Gunhee Nam, Dasol Lee, Subin Jeon, In Cho, Woongoh Cho, Sejong Yang, Dongyoung Kim, Hyolim Kang, Sukjun Hwang, and Seon Joo Kim. (2019, January). Real and Fake Face Detection, Version 1. Retrieved [Date Retrieved] from https://www.kaggle.com/datasets/ciplab/real-and-fake-face-detection.

  8. Hispanic Occluded face Images - Biometric Identification Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Hispanic Occluded face Images - Biometric Identification Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-hispanic
    Explore at:
    wav (available download format)
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    This AI training dataset contains 500+ human face with occlusion image sets for face recognition models. Each set contains 5 different images of an individual with various accessories, such as a mask, cap, sunglasses, and a combination of mask and sunglasses, along with a normal image without any accessories. A network comprising more than 500 individuals from Latin American nations, including Argentina, Brazil, Costa Rica, Ecuador, Colombia, Peru, and many more, has been used to gather the occlusion image data. The participants are males and females aged 18 to 70 years. All images were collected under different lighting conditions and backgrounds to keep the biometric dataset diverse and unbiased. All photos were taken using recent mobile devices at high quality.

    Along with the occluded face image data, it also has metadata for each participant, such as name, age, gender, country, and demographics, making it ready to use for computer vision technology. This human picture dataset for machine learning can be useful for teaching machines to recognise and identify occluded faces of Latin American people. This training dataset can also be used to create models for KYC, biometric identity, facial recognition, and occlusion identification, among other things.

    We continuously add more assets covering diverse conditions and requirements to this off-the-shelf image dataset. In accordance with your unique AI demands, we can additionally gather more detailed facial data. You can explore our crowd community for custom facial data collection.

    The license for this training dataset belongs to FutureBeeAI.

  9. Face Image Meta-Database (fIMDb)

    • osf.io
    Updated Jun 28, 2021
    Cite
    Clifford Ian Workman (2021). Face Image Meta-Database (fIMDb) [Dataset]. http://doi.org/10.17605/OSF.IO/DESZT
    Explore at:
    Dataset updated
    Jun 28, 2021
    Dataset provided by
    Center For Open Science
    Authors
    Clifford Ian Workman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An index of face databases, their features, and how to access them has been unavailable. The “Face Image Meta-Database” (fIMDb) provides researchers with the tools to find the face images best suited to their research. The fIMDb is available from: https://cliffordworkman.com/resources/

  10. 23,110 People Multi-race and Multi-pose Face Images Data

    • datatang.ai
    • nexdata.ai
    Updated Jun 3, 2022
    + more versions
    Cite
    Datatang (2022). 23,110 People Multi-race and Multi-pose Face Images Data [Dataset]. https://www.datatang.ai/datasets/1016
    Explore at:
    Dataset updated
    Jun 3, 2022
    Dataset provided by
    datatang technology inc
    Authors
    Datatang
    Variables measured
    Device, Accuracy, Data size, Data format, Data diversity, Age distribution, Race distribution, Gender distribution, Collecting environment
    Description

    23,110 People Multi-race and Multi-pose Face Images Data. The data covers Asian, Caucasian, Black, brown, and Indian subjects. 29 images were collected for each subject under different scenes and lighting conditions: 28 photos (multiple lighting conditions, poses, and scenes) plus 1 ID photo. This data can be used for face recognition related tasks.

  11. Frontal Face Images - Rotated Test Set

    • data.world
    zip
    Updated Mar 14, 2024
    Cite
    Machine Learning Research Data (2024). Frontal Face Images - Rotated Test Set [Dataset]. https://data.world/ml-research/frontal-face-images-rotated-test-set
    Explore at:
    zip (available download format)
    Dataset updated
    Mar 14, 2024
    Dataset provided by
    data.world, Inc.
    Authors
    Machine Learning Research Data
    Description

    The image dataset is used by the CMU Face Detection Project and is provided for evaluating algorithms for detecting frontal views of human faces. This particular test set was originally assembled as part of work in Neural Network Based Face Detection. It combines images collected at CMU and MIT.

    Please give appropriate acknowledgements when you use these test sets. In the lists of files below, you will find references to Test Sets A, B, C and the Rotated Test Set. Test Set B was provided by Kah-Kay Sung and Tomaso Poggio at the AI/CBCL Lab at MIT, and Test Sets A, C and the rotated test set were collected here at CMU (by Henry A. Rowley, Shumeet Baluja, and Takeo Kanade).

    In [Schneiderman and Kanade, 2000] and [Schneiderman and Kanade, 1998] we refer to the combination of test sets A, B, and C as the "combined test sets of Sung and Poggio and Rowley, Baluja, and Kanade." In [Rowley, Baluja, and Kanade, 1998] we refer to the combination of sets A, B, C as "test set one" and in [Rowley, Baluja, and Kanade, 1997] we refer to it as the "upright set" as distinguished from the "rotated set."

    Additional Information

    We provide ground truth for face locations in the following format, with one line per face (extreme side views are ignored):

    filename left-eye right-eye nose left-corner-mouth center-mouth right-corner-mouth

    For each feature on a face to be detected, two numbers are given. These numbers are the x and y coordinates (measured from the upper left corner) of the feature in the image.

    Source: http://vasc.ri.cmu.edu/idb/images/face/frontal_images/images.html

  12. Dataset for Smile Detection from Face Images

    • data.mendeley.com
    Updated Jan 24, 2017
    + more versions
    Cite
    Olasimbo Arigbabu (2017). Dataset for Smile Detection from Face Images [Dataset]. http://doi.org/10.17632/yz4v8tb3tp.5
    Explore at:
    Dataset updated
    Jan 24, 2017
    Authors
    Olasimbo Arigbabu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data is used in the second experimental evaluation of smile detection in the paper titled "Smile Detection using Hybrid Face Representation" - O. A. Arigbabu et al. 2015.

    Download the main images from the LFWcrop website (http://conradsanderson.id.au/lfwcrop/) and select the samples we used for smile and non-smile, as given in the list.

    Kindly cite:

    Arigbabu, Olasimbo Ayodeji, et al. "Smile detection using hybrid face representation." Journal of Ambient Intelligence and Humanized Computing (2016): 1-12.

    C. Sanderson, B.C. Lovell. Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference. ICB 2009, LNCS 5558, pp. 199-208, 2009

    Huang GB, Mattar M, Berg T, Learned-Miller E (2007) Labeled faces in the wild: a database for studying face recognition in unconstrained environments. University of Massachusetts, Amherst, Technical Report

  13. FFHQ Dataset

    • paperswithcode.com
    Updated Dec 11, 2018
    Cite
    Tero Karras; Samuli Laine; Timo Aila (2018). FFHQ Dataset [Dataset]. https://paperswithcode.com/dataset/ffhq
    Explore at:
    Dataset updated
    Dec 11, 2018
    Authors
    Tero Karras; Samuli Laine; Timo Aila
    Description

    Flickr-Faces-HQ (FFHQ) consists of 70,000 high-quality PNG images at 1024×1024 resolution and contains considerable variation in terms of age, ethnicity and image background. It also has good coverage of accessories such as eyeglasses, sunglasses, hats, etc. The images were crawled from Flickr, thus inheriting all the biases of that website, and automatically aligned and cropped using dlib. Only images under permissive licenses were collected. Various automatic filters were used to prune the set, and finally Amazon Mechanical Turk was used to remove the occasional statues, paintings, or photos of photos.

  14. African Historical Images - Biometric Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). African Historical Images - Biometric Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-african
    Explore at:
    wav (available download format)
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreement

    Area covered
    Africa
    Dataset funded by
    FutureBeeAI
    Description

    This AI training dataset contains 500+ human historical image sets for face recognition models. Each set contains 22 different historical images and 1 enrollment image of a human face. A network comprising more than 500 individuals from African nations, including Kenya, Malawi, Nigeria, Benin, Ethiopia, and many more, has been used to gather the historical image data. The participants are males and females aged 18 to 70 years. All images were collected under different lighting conditions and backgrounds to keep the biometric dataset diverse and unbiased. All photos are of high quality.

    Along with the historical image data, it also has metadata for each participant, such as name, age, gender, country, and demographics, making it ready to use for computer vision technology. This human picture dataset for machine learning can be useful for teaching machines to recognise and identify African people's faces. This training dataset can also be used to create models for KYC, biometric identity, and facial recognition, among other things.

    We continuously add more assets covering diverse conditions and requirements to this off-the-shelf image dataset. In accordance with your unique AI demands, we can additionally gather more detailed facial data. You can explore our crowd community for custom facial data collection.

    The license for this training dataset belongs to FutureBeeAI.

  15. Georgia Tech face database

    • academictorrents.com
    bittorrent
    Updated Oct 30, 2015
    Cite
    Ara V. Nefian (2015). Georgia Tech face database [Dataset]. https://academictorrents.com/details/0848b2c9b40e49041eff85ac4a2da71ae13a3e4f
    Explore at:
    bittorrent (available download format)
    Dataset updated
    Oct 30, 2015
    Dataset authored and provided by
    Ara V. Nefian
    License

    https://academictorrents.com/nolicensespecified

    Description

    Georgia Tech face database (128MB) contains images of 50 people taken in two or three sessions between 06/01/99 and 11/15/99 at the Center for Signal and Image Processing at Georgia Institute of Technology. All people in the database are represented by 15 color JPEG images with cluttered background taken at resolution 640x480 pixels. The average size of the faces in these images is 150x150 pixels. The pictures show frontal and/or tilted faces with different facial expressions, lighting conditions and scale. Each image is manually labeled to determine the position of the face in the image. The set of label files is available here. The Readme.txt file gives more details about the database.

  16. Balanced Faces in the Wild

    • ieee-dataport.org
    Updated Oct 11, 2022
    Cite
    Joseph Robinson (2022). Balanced Faces in the Wild [Dataset]. http://doi.org/10.21227/nmsj-df12
    Explore at:
    Dataset updated
    Oct 11, 2022
    Dataset provided by
    Institute of Electrical and Electronics Engineers (http://www.ieee.ro/)
    Authors
    Joseph Robinson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This project investigates bias in automatic facial recognition (FR). Specifically, subjects are grouped into predefined subgroups based on gender, ethnicity, and age. We propose a novel image collection called Balanced Faces in the Wild (BFW), which is balanced across eight subgroups (i.e., 800 subjects, 100 per subgroup, each with 25 face samples). Along with the name (i.e., identity) labels and task protocols (e.g., lists of pairs for face verification and a pre-packaged data table with additional metadata and labels), BFW is partitioned by ethnicity (i.e., Asian (A), Black (B), Indian (I), and White (W)) and gender (i.e., Female (F) and Male (M)). The motivation and intent are that BFW will provide a proxy for characterizing FR systems, with demographic-specific analysis now possible. For instance, various confusion metrics at a predefined criterion (i.e., score threshold) are fundamental when characterizing the performance of FR systems. The following visualization summarizes the confusion metrics in a way that relates to the different measurements.
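As an illustration of the demographic-specific analysis described above, the sketch below computes per-subgroup TPR and FPR of a face verifier at a fixed score threshold; the function name and toy data are hypothetical, not part of the BFW release:

```python
import numpy as np

def subgroup_rates(scores, labels, groups, threshold):
    """Per-subgroup TPR/FPR of a face verifier at a fixed score threshold."""
    out = {}
    for g in sorted(set(groups)):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        s = np.asarray([scores[i] for i in idx])
        y = np.asarray([labels[i] for i in idx])
        pred = s >= threshold                  # accept a pair if its score clears the threshold
        tp = int(np.sum(pred & (y == 1)))
        fn = int(np.sum(~pred & (y == 1)))
        fp = int(np.sum(pred & (y == 0)))
        tn = int(np.sum(~pred & (y == 0)))
        out[g] = {"TPR": tp / max(tp + fn, 1), "FPR": fp / max(fp + tn, 1)}
    return out

# Toy verification scores for two hypothetical subgroups at threshold 0.5
rates = subgroup_rates([0.9, 0.2, 0.8, 0.6], [1, 0, 1, 0],
                       ["AF", "AF", "BM", "BM"], 0.5)
```

Comparing a single global threshold against per-subgroup rates in this way surfaces exactly the kind of demographic disparity BFW is intended to expose.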

  17. Face dataset by Generated Photos Dataset

    • paperswithcode.com
    Updated Jun 22, 2023
    Cite
    (2023). Face dataset by Generated Photos Dataset [Dataset]. https://paperswithcode.com/dataset/face-dataset-by-generated-photos
    Explore at:
    Dataset updated
    Jun 22, 2023
    Description

    A free face dataset made for students and teachers. It contains 10,000 photos with an equal distribution across race and gender, along with metadata and facial landmarks. Free to use for research with the citation "Photos by Generated.Photos".

    Photos

    All the photos are 100% synthetic, based on model-released photos, and royalty-free. They can be used for any research purpose that does not violate the law, worldwide, with no time limitations. Quantity: 10,000. Quality: 256x256 px. Diversity: ethnicity, gender.

    Metadata

    The JSON files contain the metadata for each image in a machine-readable format, including: (1) FaceLandmarks: mouth, right_eyebrow, left_eyebrow, right_eye, left_eye, nose, jaw. (2) FaceAttributes: headPose, gender, makeup, emotion, facialHair, hair (hairColor, hairLength, bald), occlusion, ethnicity, eye_color, smile, age
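Since the metadata ships as machine-readable JSON, reading it is straightforward; the record below is illustrative only (the key names follow the schema listed above, but the exact nesting and value types are assumptions):

```python
import json

# Hypothetical record mimicking the documented FaceLandmarks/FaceAttributes schema
record = json.loads("""
{
  "FaceLandmarks": {
    "nose": [[128, 140], [130, 152]],
    "jaw":  [[60, 200], [196, 200]]
  },
  "FaceAttributes": {
    "gender": "female",
    "age": 31.0,
    "smile": 0.8,
    "hair": {"hairColor": "brown", "hairLength": "long", "bald": 0.01}
  }
}
""")

age = record["FaceAttributes"]["age"]
nose_points = record["FaceLandmarks"]["nose"]
```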

  18. Similar Face Dataset (SFD)

    • figshare.com
    zip
    Updated Jan 15, 2020
    Cite
    AnPing Song (2020). Similar Face Dataset (SFD) [Dataset]. http://doi.org/10.6084/m9.figshare.11611071.v3
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 15, 2020
    Dataset provided by
    figshare
    Authors
    AnPing Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Similar face recognition has always been one of the most challenging research directions in face recognition. This project shares the similar face images (SFD.zip) that we have collected so far. All images are labeled and collected from publicly available datasets such as LFW and CASIA-WebFace. We will continue to collect larger-scale data and to update this project. Because the dataset is large, we uploaded a compressed zip file (SFD.zip); here we also upload a few examples for everyone to view. Email: ileven@shu.edu.cn

  19. Facial Expression Image Dataset for Computer Vision Algorithms

    • salford.figshare.com
    zip
    Updated Oct 18, 2022
    Cite
    Ali Alameer; Odunmolorun Osonuga (2022). Facial Expression Image Dataset for Computer Vision Algorithms [Dataset]. http://doi.org/10.17866/rd.salford.21220835.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 18, 2022
    Dataset provided by
    University of Salford
    Authors
    Ali Alameer; Odunmolorun Osonuga
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset for this project consists of photos of individual human emotion expressions, taken with both a digital camera and a mobile phone camera from different angles, postures, backgrounds, light exposures, and distances. The task might look and sound easy, but several challenges were encountered along the way:

    1) People constraint. A major challenge was getting people to participate in the image capturing process: school was on vacation, and other individuals around the environment were unwilling to have their images captured, for personal and security reasons, even after the purpose of the project (mainly academic research) was explained. We therefore resorted to capturing images of the researcher and just a few other willing individuals.

    2) Time constraint. As with all deep learning projects, the more data available, the more accurate and less error-prone the results. At the initial stage it was agreed to collect 10 emotional-expression photos each for at least 50 people, but due to time constraints it was later agreed to capture only the researcher and the few people who were willing and available. For the same reason, photos were taken for just two types of emotional expression: "happy" and "sad" faces. To extend this work (as future work), photos of other facial expressions such as anger, contempt, disgust, fright, and surprise can be included.

    3) The approved facial emotion captures. It was agreed to capture as many angles and postures as possible for the two facial emotions, with at least 10 images per individual; due to the time and people constraints above, the final counts are: happy faces, 65 images; sad faces, 62 images.

    4) Further expansion. This project can be improved in many ways; due to the time limits of this project, these improvements are left as future work. In simple terms, the project is real-time human emotion detection: a model that reports a percentage confidence that a given facial image is happy or sad. The higher the percentage confidence, the more accurate the model is on the face fed into it.

    5) Other questions. Can the model be reproduced? The answer should be YES, if and only if the model is fed with proper data (images), such as images of the other types of emotional expression.
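The "percentage confidence" described above is typically a softmax over the model's two class scores; the minimal sketch below shows the conversion (the logit values are made up, and the real project would obtain them from a trained classifier):

```python
import math

def binary_confidence(logit_happy: float, logit_sad: float) -> dict:
    """Turn two raw class scores into percentage confidences via softmax."""
    m = max(logit_happy, logit_sad)      # subtract max for numerical stability
    e_h = math.exp(logit_happy - m)
    e_s = math.exp(logit_sad - m)
    total = e_h + e_s
    return {"happy": 100.0 * e_h / total, "sad": 100.0 * e_s / total}

conf = binary_confidence(2.0, 0.5)       # hypothetical raw scores from a classifier
```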

  20. Large-scale Labeled Faces (LSLF) Dataset.zip

    • figshare.com
    • commons.datacite.org
    Updated Jun 1, 2023
    Cite
    Tarik Alafif; Zeyad Hailat; Melih Aslan; Xuewen Chen (2023). Large-scale Labeled Faces (LSLF) Dataset.zip [Dataset]. http://doi.org/10.6084/m9.figshare.13077329.v1
    Explore at:
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Tarik Alafif; Zeyad Hailat; Melih Aslan; Xuewen Chen
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Our LSLF dataset consists of 1,195,976 labeled face images of 11,459 individuals. The images are stored in JPEG format with a total size of 5.36 GB. Individuals have between 1 and 1,157 face images, with an average of 104 face images per individual. Each image is automatically named as (PersonName VideoNumber FrameNumber ImageNumber) and stored in the corresponding individual's folder.
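Given the naming scheme above, the per-image identifiers can be recovered from the file name; the snippet assumes space-separated fields and a .jpg extension, which is how the description reads but is not independently verified:

```python
def parse_lslf_name(filename: str) -> dict:
    """Split an LSLF file name of the form
    'PersonName VideoNumber FrameNumber ImageNumber.jpg' into its fields."""
    stem = filename.rsplit(".", 1)[0]
    # Split from the right: person names may themselves contain spaces
    parts = stem.rsplit(" ", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected LSLF file name: {filename!r}")
    person, video, frame, image = parts
    return {"person": person, "video": int(video),
            "frame": int(frame), "image": int(image)}

info = parse_lslf_name("John Doe 3 120 7.jpg")  # hypothetical file name
```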

Synthetic-Faces-High-Quality--SFHQ--part-2

91K curated 1024x1024 face images: StyleGAN2 encodings of 3D models and Stable Diffusion 1.4 images

Description

Synthetic Faces High Quality (SFHQ) part 2

This dataset consists of 91,361 high-quality, curated 1024x1024 face images. It was created by "bringing to life" various 3D models and by correcting bad "text to image" generations from the stable diffusion model, using a process similar to the one described in this short twitter thread, which involves encoding the images into StyleGAN2 latent space and performing a small manipulation that turns each image into a photo-realistic image.

The dataset also contains facial landmarks (an extended set) and face parsing semantic segmentation maps. An example script is provided that demonstrates how to access landmarks and segmentation maps and how to textually search within the dataset (with CLIP image/text feature vectors); it also performs some exploratory analysis of the dataset. Link to the github repo of the dataset.
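Textual search with CLIP features reduces to cosine similarity between a text embedding and the precomputed image embeddings; a sketch with numpy (the toy arrays stand in for real CLIP feature vectors produced elsewhere):

```python
import numpy as np

def text_search(image_feats: np.ndarray, text_feat: np.ndarray, top_k: int = 5):
    """Rank images by cosine similarity between L2-normalized feature vectors."""
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feat / np.linalg.norm(text_feat)
    sims = img @ txt                      # cosine similarity per image
    return np.argsort(-sims)[:top_k]      # indices of best matches, best first

# Toy 2-D "features": the second image aligns best with the "text" vector
order = text_search(np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]),
                    np.array([0.0, 1.0]), top_k=2)
```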

The process that "brings to life" face-like images and creates several candidate photo-realistic ones is illustrated here: https://i.ibb.co/0sw8TkL/bring-to-life-process-SD-FS-2.png

More Details

  1. The original inspiration images are taken from the Face Synthetics dataset, which contains 3D models of faces, and from images generated with the stable diffusion v1.4 model using various face portrait prompts that span a wide range of ethnicities, ages, expressions, hairstyles, etc. Note that stable diffusion faces often contain extreme errors in the generation (as can be seen in the three rightmost columns in the image above), so they cannot be used to create a photo-realistic dataset without a correcting model or an extremely lengthy manual curation process.
  2. Each inspiration image was encoded by encoder4editing (e4e) into StyleGAN2 latent space (StyleGAN2 is a generative face model trained on the FFHQ dataset), and multiple candidate images were generated from each inspiration image.
  3. These candidate images were then further curated and verified as photo-realistic and high quality by a single human (me) and by a machine learning assistant model trained to approximate my own judgments, which helped me scale myself to assess the quality of all images in the dataset.
  4. Near duplicates and images that were too similar were removed using CLIP features (no two images in the dataset have a CLIP similarity score greater than ~0.92).
  5. From each image, various pre-trained features were extracted and are provided here for convenience, in particular CLIP features for fast textual querying of the dataset.
  6. From each image, semantic segmentation maps were extracted using Face Parsing BiSeNet and are provided in the dataset under "segmentations".
  7. From each image, an extended landmark set was extracted that also contains inner and outer hairlines (these are unique landmarks that are usually not extracted by other algorithms). The landmarks were extracted using Dlib, Face Alignment, and some post-processing of Face Parsing BiSeNet, and are provided in the dataset under "landmarks".
  8. NOTE: semantic segmentation maps and landmarks were first calculated on scaled-down 256x256 versions of the images and then upscaled to 1024x1024.
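Step 4 (near-duplicate removal at a ~0.92 CLIP similarity cutoff) can be sketched as a greedy filter over normalized features; this is an illustrative reconstruction, not the author's actual script:

```python
import numpy as np

def dedup_by_clip(feats: np.ndarray, threshold: float = 0.92) -> list:
    """Keep an image only if its cosine similarity to every
    already-kept image stays at or below the threshold."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    kept = []
    for i in range(len(normed)):
        if all(float(normed[i] @ normed[j]) <= threshold for j in kept):
            kept.append(i)
    return kept

# The near-identical second vector is dropped; the orthogonal third survives
kept = dedup_by_clip(np.array([[1.0, 0.0], [0.98, 0.05], [0.0, 1.0]]))
```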

Parts 1,2,3,4

  • Part 1 of the dataset consists of 89,785 HQ 1024x1024 curated face images. It uses "inspiration" images from Artstation-Artistic-face-HQ dataset (AAHQ), Close-Up Humans dataset and UIBVFED dataset.
  • Part 2 of the dataset consists of 91,361 HQ 1024x1024 curated face images. It uses "inspiration" images from Face Synthetics dataset and by sampling from the Stable Diffusion v1.4 text to image generator using varied face portrait prompts.
  • Part 3 of the dataset consists of 118,358 HQ 1024x1024 curated face images. It uses "inspiration" images sampled from the StyleGAN2 mapping network with very high truncation psi coefficients to increase the diversity of the generations. Here, the e4e encoder is basically used as a new kind of truncation trick.
  • Part 4 of the dataset consists of 125,754 HQ 1024x1024 curated face images. It uses "inspiration" images by sampling from the Stable Diffusion v2.1 text to image generator using varied face portrait prompts.
  • See also dataset github repo with full details and links

Summary

Overall, the SFHQ dataset contains ~425,000 high quality and curated synthetic face images that have no privacy issues or license issues surrounding them.

This dataset contains a high degree of variability along the axes of identity, ethnicity, age, pose, expression, lighting conditions, hair style, hair color, and facial hair. It lacks variability along accessory axes such as hats, earphones, and various jewelry. It also doesn't contain any occlusions, except the self-occlusion of hair covering the forehead, the ears, and (rarely) the eyes. The dataset naturally inherits all the biases of its source datasets (FFHQ, AAHQ, Close-Up Humans, Face Synthetics, LAION-5B) and of the StyleGAN2 and Stable Diffusion models.

The purpose of this dataset is to be of sufficiently high quality that new machine learning models can be trained using this data, including even generative face models such as StyleGAN. The dataset may be extended from time to time with additional supervision labels (e.g. text descriptions), but no promises.

Hope this is helpful to some of you, feel free to use as you see fit...
