Labeled Faces in the Wild is a database of face photographs designed for studying the problem of unconstrained face recognition. The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1,680 of the people pictured have two or more distinct photos in the data set. The only constraint on these faces is that they were detected by the Viola-Jones face detector. More details can be found in the technical report below.
https://academictorrents.com/nolicensespecified
The "Labeled Faces in the Wild-a" image collection is a database of labeled, face images intended for studying Face Recognition in unconstrained images. It contains the same images available in the original Labeled Faces in the Wild data set, however, here we provide them after alignment using a commercial face alignment software. Some of our results, published in [1,2,3], were produced using these images. We show this alignment to improve the performance of face recognition algorithms. More information on how these images were aligned may be found in the two papers. We have maintained the same directory structure as in the original LFW data set, and so these images can be used as direct substitutes for those in the original image set. Note, however, that the images available here are grayscale versions of the originals. Citation: If you find these images useful and use them in your work, please follow these guidlines: Comply with any instructions specified for the original L
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset comprises 16.7k images and two annotation files, each in a distinct format. The first file, labeled "Label", contains annotations at the original scale.
Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments
To use this dataset:
```python
import tensorflow_datasets as tfds

ds = tfds.load('lfw', split='train')
for ex in ds.take(4):
  print(ex)
```
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/lfw-0.1.1.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Face Label is a dataset for object detection tasks - it contains Faces annotations for 3,226 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://creativecommons.org/publicdomain/zero/1.0/
Hello everyone, this is a dataset I am sharing. It contains happy and non-happy facial expressions for practicing binary classification, with labelled images of happy facial expressions. I found this dataset while learning on Coursera, and I'd like to acknowledge them as the primary owner of the dataset.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The LFW (Labeled Faces in the Wild) dataset is a popular benchmark dataset in the field of face recognition. It is used for evaluating and training face recognition algorithms and models.
http://www.cbsr.ia.ac.cn/faceevaluation/user_agreement.pdf
The dataset contains 5,250 images with 11,931 annotated faces collected from the Internet. Each face carries the following annotations: a square bounding box; the pose deformation level for yaw, pitch and roll (small, medium, large); an 'ignore' flag for faces that are smaller than 20x20 pixels or extremely difficult to recognize (838 faces in total, accounting for ~7%); and other facial attributes: gender (female, male, unknown), isWearingGlasses, isOccluded and isExaggeratedExpression.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Our LSLF dataset consists of 1,195,976 labeled face images for 11,459 individuals. These images are stored in JPEG format with a total size of 5.36 GB. Individuals have a minimum of 1 face image and a maximum of 1,157 face images. The average number of face images per individual is 104. Each image is automatically named as (PersonName VideoNumber FrameNumber ImageNumber) and stored in the related individual's folder.
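As a rough illustration of how this naming scheme can be consumed, the sketch below walks the per-individual folders and splits each filename into its four fields. The underscore delimiter and the .jpg extension are assumptions for illustration; check them against the actual LSLF files.

```python
# Hedged sketch: iterate the LSLF folder tree and parse filenames assumed to be
# of the form PersonName_VideoNumber_FrameNumber_ImageNumber.jpg. The delimiter
# and extension are assumptions, not confirmed by the dataset description.
from pathlib import Path

def parse_lslf_name(path: Path) -> dict:
    # rsplit keeps any underscores inside PersonName intact
    person, video, frame, image = path.stem.rsplit("_", 3)
    return {"person": person, "video": video, "frame": frame, "image": image}

def iter_lslf(root: str):
    """Yield (individual_folder, parsed_fields) for every face image under root."""
    for img_path in Path(root).rglob("*.jpg"):
        yield img_path.parent.name, parse_lslf_name(img_path)
```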
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Dataset Description:
The dataset comprises a collection of photos of people, organized into folders labeled "women" and "men." Each folder contains a significant number of images to facilitate training and testing of gender detection algorithms or models.
The dataset contains a variety of images capturing female and male individuals from diverse backgrounds, age groups, and ethnicities.
This labeled dataset can be utilized as training data for machine learning models, computer vision applications, and gender detection algorithms.
The dataset is split into train and test folders. Each folder includes:
- women and men folders, containing images of people of the corresponding gender,
- a .csv file containing information about the images and people in the dataset.
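Given this folder layout, a minimal loading sketch with TensorFlow's directory loader (consistent with the tensorflow_datasets example above) might look as follows; the folder names "train"/"test" and the image size are assumptions to adapt to the actual download.

```python
# Minimal sketch: load the women/men folders as a binary classification dataset.
# Paths and image size are assumptions; adjust to the downloaded layout.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "train",                # contains women/ and men/ subfolders
    labels="inferred",      # label = subfolder name
    label_mode="binary",
    image_size=(224, 224),
    batch_size=32,
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "test",
    labels="inferred",
    label_mode="binary",
    image_size=(224, 224),
    batch_size=32,
)
print(train_ds.class_names)  # e.g. ['men', 'women']
```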
keywords: biometric system, biometric system attacks, biometric dataset, face recognition database, face recognition dataset, face detection dataset, facial analysis, gender detection, supervised learning dataset, gender classification dataset, gender recognition dataset
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The BioID Face Database has been recorded and published to give all researchers working in the area of face detection the possibility to compare the quality of their face detection algorithms with others. During the recording, special emphasis was laid on real-world conditions; therefore the test set features a large variety of illumination, backgrounds and face sizes. The dataset consists of 1,521 gray-level images with a resolution of 384x286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. For comparison purposes, the set also contains manually set eye positions. The images are labeled BioID_xxxx.pgm, where the characters xxxx are replaced by the index of the current image (with leading zeros). Similarly, the files BioID_xxxx.eye contain the eye positions for the corresponding images.
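A small sketch for pairing each BioID_xxxx.pgm image with its BioID_xxxx.eye file is given below. It assumes each .eye file holds an optional comment header followed by four whitespace-separated coordinates (left eye x, left eye y, right eye x, right eye y); verify this layout against the actual files.

```python
# Hedged sketch: read BioID eye-position files and pair them with the images.
# Assumes .eye files contain a comment header plus four whitespace-separated
# values (lx ly rx ry); check the real files before relying on this.
from pathlib import Path

def read_eye_file(path: Path):
    """Return ((lx, ly), (rx, ry)) parsed from a BioID .eye file."""
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip empty lines and the comment header
        lx, ly, rx, ry = map(float, line.split())
        return (lx, ly), (rx, ry)
    raise ValueError(f"no coordinates found in {path}")

def iter_bioid(root: str):
    """Yield (image_path, eye_positions) for every BioID image under root."""
    for pgm in sorted(Path(root).glob("BioID_*.pgm")):
        eye = pgm.with_suffix(".eye")
        if eye.exists():
            yield pgm, read_eye_file(eye)
```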
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data is used in the second experimental evaluation of face smile detection in the paper titled "Smile Detection Using Hybrid Face Representation" - O.A. Arigbabu et al., 2015.
Download the main images from the LFWcrop website (http://conradsanderson.id.au/lfwcrop/) to select the samples we used for smile and non-smile, as given in the list.
Kindly cite:
Arigbabu, Olasimbo Ayodeji, et al. "Smile detection using hybrid face representation." Journal of Ambient Intelligence and Humanized Computing (2016): 1-12.
C. Sanderson, B.C. Lovell. Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference. ICB 2009, LNCS 5558, pp. 199-208, 2009
Huang GB, Mattar M, Berg T, Learned-Miller E (2007) Labeled faces in the wild: a database for studying face recognition in unconstrained environments. University of Massachusetts, Amherst, Technical Report
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The WIDER FACE dataset is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. We choose 32,203 images and label 393,703 faces with a high degree of variability in scale, pose and occlusion, as depicted in the sample images. The WIDER FACE dataset is organized based on 61 event classes. For each event class, we randomly select 40%/10%/50% of the data as training, validation and testing sets. We adopt the same evaluation metric employed in the PASCAL VOC dataset. As with the MALF and Caltech datasets, we do not release bounding box ground truth for the test images. Users are required to submit final prediction files, which we shall proceed to evaluate.
We need a dataset with cropped faces to evaluate face verification algorithms. Because of that, I developed a Python script to detect and crop the faces in all images from LFW (http://vis-www.cs.umass.edu/lfw/index.html).
ONLY FOR DEVELOPMENT PURPOSES
There is one folder per person; some contain many faces, while others contain only one. There are 13,137 faces in the dataset.
It's easy to get these faces, but this seemed (to me) a convenient way to save them and reuse them across different algorithms.
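A script of this kind can be sketched with OpenCV's Haar-cascade (Viola-Jones) detector, mirroring LFW's person-per-folder layout in the output. This is an illustrative sketch, not the author's original script, and the paths and detector parameters are assumptions.

```python
# Hedged sketch: detect faces with OpenCV's Viola-Jones (Haar cascade) detector
# and save the crops, keeping one output folder per person as in LFW.
import cv2
from pathlib import Path

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def crop_faces(src_root: str, dst_root: str) -> None:
    for img_path in Path(src_root).rglob("*.jpg"):
        img = cv2.imread(str(img_path))
        if img is None:
            continue  # unreadable file, skip
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        out_dir = Path(dst_root) / img_path.parent.name  # person folder
        for i, (x, y, w, h) in enumerate(faces):
            out_dir.mkdir(parents=True, exist_ok=True)
            cv2.imwrite(str(out_dir / f"{img_path.stem}_{i}.jpg"),
                        img[y:y + h, x:x + w])

# crop_faces("lfw", "lfw_cropped")
```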
https://academictorrents.com/nolicensespecified
Data from the paper: "Unsupervised Joint Alignment of Complex Images" Gary B. Huang and Vidit Jain and Erik Learned-Miller ICCV 2007 Welcome to Labeled Faces in the Wild, a database of face photographs designed for studying the problem of unconstrained face recognition. The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1680 of the people pictured have two or more distinct photos in the data set. The only constraint on these faces is that they were detected by the Viola-Jones face detector. More details can be found in the technical report below. Information: 13233 images 5749 people 1680 people with two or more images
https://academictorrents.com/nolicensespecified
Data from the paper: "Learning to Align from Scratch", Gary B. Huang, Marwan Mattar, Honglak Lee and Erik Learned-Miller, NIPS 2012. Welcome to Labeled Faces in the Wild, a database of face photographs designed for studying the problem of unconstrained face recognition. The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1,680 of the people pictured have two or more distinct photos in the data set. The only constraint on these faces is that they were detected by the Viola-Jones face detector. More details can be found in the technical report below. Information: 13,233 images; 5,749 people; 1,680 people with two or more images.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data presented here consists of three parts:

Dataset 1: In this set, we extract 327,322 faces from our entire collection of 3,389 issues and automatically classify each face as male or female. We present this data as a single table with columns identifying the date, issue, page number, the coordinates identifying the position of the face on the page, and the classification (male or female). The coordinates identifying the position of the face on the page are based on the size and resolution of the pages found in the "Time Vault".

Dataset 2: Dataset 2 consists of 8,789 classified faces from 100 selected issues. Human labor was used to identify and extract 3,299 face images from 39 issues, which were later classified by another set of workers. This selection of 39 issues contains one issue per decade spanned by the archive plus one issue per year between 1961 and 1991, and the extracted face images were used to train the face extraction algorithm. The remaining 5,490 faces from 61 issues were extracted via machine learning before being classified by human coders. These 61 issues were chosen to complement the first selection of 39 issues: one issue per year for all years in the archive excluding those between 1961 and 1991. Thus, Dataset 2 contains fully labelled faces from at least one issue per year.

Dataset 3: In the interest of transparency, Dataset 3 consists of the raw data collected to create Dataset 2 and comprises two tables. Before explaining these tables, we first briefly describe our data collection and verification procedures, which have been fully described elsewhere. A custom AMT interface was used to enable human coders to classify faces according to the categories in Table 4. Each worker was given a randomly selected batch of 25 pages, each with a clearly highlighted face to be categorized, of which three pages were verification pages with known features, used for quality control. Each face was labeled by two distinct human coders, chosen at random so that the pairing of coders varied with the image. A proficiency rating was calculated for each coder by considering all images they annotated and computing the average number of labels that matched those identified by the image's other coder. The tables in Dataset 2 were created by resolving inconsistencies between the two image coders by selecting the labels from the coder with the higher proficiency rating. Prior to calculating the proficiency score, all faces that were tagged as having 'Poor' or 'Error' image quality by either of the two coders were eliminated. Due to technical bugs when the AMT interface was first implemented, a small number of images were only labeled once; these were also eliminated from Datasets 2 and 3. In Dataset 3, we present the raw annotations from each coder that tagged each face, along with demographic data for each coder. Dataset 3 consists of two tables: the raw data from each of the two sets of coders, and the demographic information for each of the coders.
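The label-resolution step can be sketched roughly as follows: compute each coder's proficiency as the average per-image agreement with the other coder, then keep the more proficient coder's labels where the two disagree. The record layout (image/coder/labels fields) is an assumption for illustration, not the schema of the released tables.

```python
# Hedged sketch of the proficiency-based resolution described above.
# Each annotation is assumed to be {"image": ..., "coder": ..., "labels": {...}};
# this layout is illustrative, not the released data schema.
from collections import defaultdict

def group_by_image(annotations):
    by_image = defaultdict(list)
    for a in annotations:
        by_image[a["image"]].append(a)
    return by_image

def proficiency(annotations):
    """Average number of labels matching the image's other coder, per coder."""
    agree = defaultdict(list)
    for anns in group_by_image(annotations).values():
        if len(anns) != 2:
            continue  # images labeled only once are excluded
        a, b = anns
        matches = sum(a["labels"].get(k) == v for k, v in b["labels"].items())
        agree[a["coder"]].append(matches)
        agree[b["coder"]].append(matches)
    return {c: sum(m) / len(m) for c, m in agree.items()}

def resolve(annotations, scores):
    """Per image, keep the labels of the coder with the higher proficiency."""
    return {
        img: max(anns, key=lambda a: scores.get(a["coder"], 0.0))["labels"]
        for img, anns in group_by_image(annotations).items()
        if len(anns) == 2
    }
```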
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Face Detection With Yolov8 is a dataset for object detection tasks - it contains Face annotations for 3,479 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
FGnet Markup Scheme of the BioID Face Database - The BioID Face Database is being used within the FGnet project of the European Working Group on face and gesture recognition. David Cristinacce and Kola Babalola, PhD students from the department of Imaging Science and Biomedical Engineering at the University of Manchester marked up the images from the BioID Face Database. They selected several additional feature points, which are very useful for facial analysis and gesture recognition.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The FaciaVox dataset is an extensive multimodal biometric resource designed to enable in-depth exploration of face-image and voice recording research areas in both masked and unmasked scenarios.
Features of the Dataset:
1. Multimodal Data: A total of 1,800 face images (JPG) and 6,000 audio recordings (WAV) were collected, enabling cross-domain analysis of visual and auditory biometrics.
2. Participants were categorized into four age groups for structured labeling:
Label 1: Under 16 years
Label 2: 16 to less than 31 years
Label 3: 31 to less than 46 years
Label 4: 46 years and above
3. Sibling Data: Some participants are siblings, adding a challenging layer for speaker identification and facial recognition tasks due to genetic similarities in vocal and facial features. Sibling relationships are documented in the accompanying "FaciaVox List" data file.
4. Standardized Filenames: The dataset uses a consistent, intuitive naming convention for both facial images and voice recordings (a parsing sketch follows this feature list). Each filename includes:
Type (F: Face Image, V: Voice Recording)
Participant ID (e.g., sub001)
Mask Type (e.g., a: unmasked, b: disposable mask, etc.)
Zoom Level or Sentence ID (e.g., 1x, 3x, 5x for images or specific sentence identifier {01, 02, 03, ..., 10} for recordings)
5. Diverse Demographics: Participants come from 19 different countries.
6. A challenging face recognition problem involving reflective mask shields and severe lighting conditions.
7. Each participant uttered 7 English statements and 3 Arabic statements, regardless of their native language. This adds a challenge for speaker identification.
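A minimal parser for the filename convention in item 4 might look as follows; the underscore delimiter and the example names (e.g. F_sub001_a_3x.jpg) are assumptions, so verify them against the actual FaciaVox files and the "FaciaVox List" metadata.

```python
# Hedged sketch: split a FaciaVox filename into its four documented fields.
# The underscore delimiter and the example names are assumptions, not confirmed.
from pathlib import Path

MASK_TYPES = {"a": "unmasked", "b": "disposable mask"}  # extend from the FaciaVox List file

def parse_faciavox(filename: str) -> dict:
    kind, participant, mask, last = Path(filename).stem.split("_")
    return {
        "modality": "face image" if kind == "F" else "voice recording",
        "participant": participant,          # e.g. sub001
        "mask": MASK_TYPES.get(mask, mask),  # e.g. 'a' -> unmasked
        # zoom level (1x/3x/5x) for images, sentence ID (01-10) for recordings
        "zoom_or_sentence": last,
    }

# parse_faciavox("F_sub001_a_3x.jpg")
# parse_faciavox("V_sub001_b_07.wav")
```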
Research Applications
FaciaVox is a versatile dataset supporting a wide range of research domains, including but not limited to:
• Speaker Identification (SI) and Face Recognition (FR): Evaluating biometric systems under varying conditions.
• Impact of Masks on Biometrics: Investigating how different facial coverings affect recognition performance.
• Language Impact on SI: Exploring the effects of native and non-native speech on speaker identification.
• Age and Gender Estimation: Inferring demographic information from voice and facial features.
• Race and Ethnicity Matching: Studying biometrics across diverse populations.
• Synthetic Voice and Deepfake Detection: Detecting cloned or generated speech.
• Cross-Domain Biometric Fusion: Combining facial and vocal data for robust authentication.
• Speech Intelligibility: Assessing how masks influence speech clarity.
• Image Inpainting: Reconstructing occluded facial regions for improved recognition.
Researchers can use the facial images and voice recordings independently or in combination to explore multimodal biometric systems. The standardized filenames and accompanying metadata make it easy to align visual and auditory data for cross-domain analyses. Sibling relationships and demographic labels add depth for tasks such as familial voice recognition, demographic profiling, and model bias evaluation.