Facial Keypoint Detection Dataset for biometric verification, facial recognition security, and realistic AR/VR experiences
This dataset was created by Tom Nguyen
It contains the following files:
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset for detecting the locations of landmarks on a face.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The development of facial expression recognition (FER) and facial expression generation (FEG) systems is essential to enhance human-robot interactions (HRI). The facial action coding system is widely used in FER and FEG tasks, as it offers a framework to relate the action of facial muscles and the resulting facial motions to the execution of facial expressions. However, most FER and FEG studies are based on measuring and analyzing facial motions, leaving the facial muscle component relatively unexplored. This study introduces a novel framework using surface electromyography (sEMG) signals from facial muscles to recognize facial expressions and estimate the displacement of facial keypoints during the execution of the expressions. For the facial expression recognition task, we studied the coordination patterns of seven muscles, expressed as three muscle synergies extracted through non-negative matrix factorization, during the execution of six basic facial expressions. Muscle synergies are groups of muscles that show coordinated patterns of activity, as measured by their sEMG signals, and are hypothesized to form the building blocks of human motor control. We then trained two classifiers for the facial expressions based on extracted features from the sEMG signals and the synergy activation coefficients of the extracted muscle synergies, respectively. The accuracy of both classifiers outperformed other systems that use sEMG to classify facial expressions, although the synergy-based classifier performed marginally worse than the sEMG-based one (classification accuracy: synergy-based 97.4%, sEMG-based 99.2%). However, the extracted muscle synergies revealed common coordination patterns between different facial expressions, allowing a low-dimensional quantitative visualization of the muscle control strategies involved in human facial expression generation. We also developed a skin-musculoskeletal model enhanced by linear regression (SMSM-LRM) to estimate the displacement of facial keypoints during the execution of a facial expression based on sEMG signals. Our proposed approach achieved a relatively high fidelity in estimating these displacements (NRMSE 0.067). We propose that the identified muscle synergies could be used in combination with the SMSM-LRM model to generate motor commands and trajectories for desired facial displacements, potentially enabling the generation of more natural facial expressions in social robotics and virtual reality.
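As a rough illustration of the synergy-extraction step described above, the sketch below applies off-the-shelf non-negative matrix factorization to sEMG envelopes from seven facial muscles to obtain three synergies; this is not the authors' code, and the array shapes, preprocessing, and parameter choices are assumptions.

```python
# Illustrative sketch only: extract three muscle synergies from non-negative
# sEMG envelopes of seven facial muscles with NMF, as outlined in the abstract.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
emg_envelopes = rng.random((1000, 7))  # placeholder for (time samples, 7 muscles), non-negative

nmf = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
activations = nmf.fit_transform(emg_envelopes)  # (time, 3) synergy activation coefficients
synergies = nmf.components_                     # (3, 7) muscle weights of each synergy

# The activation coefficients could then serve as input features for an
# expression classifier, alongside (or instead of) raw sEMG features.
print(synergies.shape, activations.shape)
```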
This dataset was created by Julian Lenkiewicz
A few days ago I was looking to start a new project but couldn't find one that seemed exciting to me. Then I came across facial landmarks and started looking for datasets for the task. There were many datasets, but the Flickr dataset (FFHQ) came out to be the best of them, with 70,000 images annotated with 68 facial landmarks each. As the size suggests, the data is also big, around 900 GB, so I decided to build a smaller version of it so that we are able to at least work on such a task. So I created this dataset.
The objective of creating this dataset is to predict keypoint positions on face images. This can serve as a building block in several applications, such as biometric verification, facial recognition security, and AR/VR experiences.
Detecting facial keypoints is a very challenging problem. Facial features vary greatly from one individual to another, and even for a single individual there is a large amount of variation due to 3D pose, size, position, viewing angle, and illumination conditions. Computer vision research has come a long way in addressing these difficulties, but there remain many opportunities for improvement.
Some sample images:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2137176%2Fdb17e16db7aefd0848ca3acd99001262%2Fdownload.png?generation=1608374055920310&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2137176%2Fdfa119b710b9edb47f0f6b2326b4cbdd%2Fdownload_1.png?generation=1608374048827571&alt=media
The original dataset can be found at https://github.com/NVlabs/ffhq-dataset.
This dataset contains 6,000 records in two files:
1. A JSON file with the following format:
{'face_landmarks': [[191.5, 617.5], [210.5, 717.5], ...],
 'file_name': '00000.png'}
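A minimal sketch of reading this metadata, assuming the JSON file holds a list of records in the format shown above (the file name here is hypothetical):

```python
# Minimal sketch: load the landmark metadata and inspect a few records.
import json

with open("metadata.json") as f:        # hypothetical file name
    records = json.load(f)

for rec in records[:3]:
    name = rec["file_name"]             # e.g. "00000.png"
    landmarks = rec["face_landmarks"]   # list of [x, y] pairs (68 points per face)
    print(name, len(landmarks), landmarks[0])
```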
The individual images were published in Flickr by their respective authors under either Creative Commons BY 2.0, Creative Commons BY-NC 2.0, Public Domain Mark 1.0, Public Domain CC0 1.0, or U.S. Government Works license. All of these licenses allow free use, redistribution, and adaptation for non-commercial purposes. However, some of them require giving appropriate credit to the original author, as well as indicating any changes that were made to the images. The license and original author of each image are indicated in the metadata.
https://creativecommons.org/licenses/by/2.0/
https://creativecommons.org/licenses/by-nc/2.0/
https://creativecommons.org/publicdomain/mark/1.0/
https://creativecommons.org/publicdomain/zero/1.0/
http://www.usa.gov/copyright.shtml
The dataset itself (including JSON metadata, download script, and documentation) is made available under the Creative Commons BY-NC-SA 4.0 license by NVIDIA Corporation. You can use, redistribute, and adapt it for non-commercial purposes, as long as you (a) give appropriate credit by citing our paper, (b) indicate any changes that you've made, and (c) distribute any derivative works under the same license.
https://creativecommons.org/licenses/by-nc-sa/4.0/
It takes a lot of time and resources to generate this dataset in one run, so I need to run it multiple times, generating different subsets; hence it takes a long time to complete.
Date: 19/12/2020. Currently it has 6,000 images and respective metadata.
Date: 19/12/2020. Currently it has 10,000 images and respective metadata.
Date: 23/12/2020. Updated: it correctly has 5,000 images and respective metadata.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Facial and key point detection in dairy cows can assist farms in building recognition systems and estimating cow facial postures. This dataset was primarily collected in Lu'an, Anhui Province, and Huai'an, Jiangsu Province. It contains 2,538 images of Holstein cow faces under various conditions, including different lighting, occlusion, levels of blurriness, angles, flipping, and noise, with both single and multiple cows. The Labelme software was used to annotate the cow's facial detection bounding box and five key points (the left and right eyes, the nose, and the corners of the mouth), which helps to advance the development of cow facial detection and pose estimation.
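For orientation, a minimal sketch of reading one such Labelme annotation is shown below; the "shapes"/"label"/"points" layout is standard Labelme JSON, but the file name and the exact label strings used in this dataset are assumptions.

```python
# Minimal sketch: read a Labelme JSON file and separate the face bounding box
# from the five keypoints (eyes, nose, mouth corners).
import json

with open("cow_0001.json") as f:  # hypothetical file name
    ann = json.load(f)

boxes, keypoints = [], {}
for shape in ann["shapes"]:
    if shape["shape_type"] == "rectangle":      # cow face bounding box
        boxes.append(shape["points"])           # [[x1, y1], [x2, y2]]
    elif shape["shape_type"] == "point":        # one of the five facial keypoints
        keypoints[shape["label"]] = shape["points"][0]

print(boxes, keypoints)
```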
CASIA-Face-Africa is a face image database which contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. Demographic attributes and facial expressions of the subjects are also carefully recorded. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. A group of evaluation protocols are constructed according to different applications, tasks, partitions and scenarios. The proposed database along with its face landmark annotations, evaluation protocols and preliminary results form a good benchmark to study the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Body Face Key Points is a dataset for computer vision tasks - it contains Person annotations for 942 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://www.archivemarketresearch.com/privacy-policy
The global face key point detection market reached a value of million USD in 2025, and is expected to reach million USD by 2033, exhibiting a CAGR of XX% during the forecast period. Increasing advancements in augmented reality (AR) and virtual reality (VR) technologies are boosting the market growth. Growing adoption of facial recognition systems in various application areas, including smartphone unlocking, user authentication, and video surveillance, is driving the demand for face key point detection solutions. Moreover, the rising need for accurate and efficient methods for face alignment and expression recognition in image processing applications is contributing to the market growth. The holistic approach segment held the largest share of the market in 2025, and is expected to maintain its dominance during the forecast period. This can be attributed to the high accuracy and reliability of the holistic approach for estimating facial key points. However, the regression-based methods segment is anticipated to exhibit the highest CAGR over the forecast period, owing to the increasing adoption of deep learning and machine learning techniques for face key point detection.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This project reannotated the Celebrities in Frontal-Profile (CFP) [1] dataset. It selected all the profile face images (2000 images) from the CFP dataset, aligned their facial directions to the same orientation, and annotated them with a target box (side face) and five keypoints (tragus, eye corner, nose tip, upper lip, and mouth corner). The annotation format used is the JSON format of Labelme. Please note that this project involved solely the reannotation of an existing dataset. [1] Sengupta, S., Chen, J. C., Castillo, C., Patel, V. M., Chellappa, R., & Jacobs, D. W. (2016, March). Frontal to profile face verification in the wild. In 2016 IEEE winter conference on applications of computer vision (WACV) (pp. 1-9). IEEE.
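The project's exact alignment procedure is not described; one common way to bring all profile faces to the same orientation is a horizontal flip with mirrored keypoint x-coordinates, sketched below with hypothetical file names and values.

```python
# Illustrative sketch only: mirror a left-facing profile image and its
# keypoints so that all faces point in the same direction.
from PIL import Image

img = Image.open("profile_0001.jpg")          # hypothetical file
keypoints = [(120.0, 200.0), (180.0, 190.0)]  # placeholder (x, y) keypoints
faces_left = True                              # would come from metadata or a heuristic

if faces_left:
    img = img.transpose(Image.FLIP_LEFT_RIGHT)
    w = img.width
    keypoints = [(w - 1 - x, y) for x, y in keypoints]
```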
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Indoor Facial 182 Keypoints dataset is a specialized resource aimed at the internet, media, entertainment, and mobile industries, focusing on detailed facial analysis. It contains images of 50 people in indoor settings, with a balanced gender distribution and an age range of 18 to 50 years. Each face is annotated with 182 keypoints, facilitating precise tracking and analysis of facial features.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Neuropsychological research aims to unravel how diverse individuals' brains exhibit similar functionality when exposed to the same stimuli. The evocation of consistent responses when different subjects watch the same emotionally evocative stimulus has been observed through modalities like fMRI, EEG, physiological signals, and facial expressions. We refer to the quantification of these shared consistent signals across subjects at each time instant along the temporal dimension as Consistent Response Measurement (CRM). CRM is widely explored through fMRI, and occasionally with EEG, physiological signals, and facial expressions, using metrics like Inter-Subject Correlation (ISC). However, fMRI tools are expensive and constrained, while EEG and physiological signals are prone to facial artifacts and environmental conditions (such as temperature, humidity, and the health condition of subjects). In this research, facial expression videos are used as a cost-effective and flexible alternative for CRM, minimally affected by external conditions. By employing computer vision-based automated facial keypoint tracking, a new metric similar to ISC, called the Average t-statistic, is introduced. Unlike existing facial expression-based methodologies that measure CRM through secondary indicators like inferred emotions and keypoint- or ICA-based features, the Average t-statistic is closely associated with the direct measurement of consistent facial muscle movement using the Facial Action Coding System (FACS). This is evidenced in the DISFA dataset, where the time series of the Average t-statistic has a high correlation (R² = 0.78) with a metric called AU consistency, which directly measures facial muscle movement through FACS coding of video frames. The simplicity of recording facial expressions with the automated Average t-statistic expands the applications of CRM, such as measuring engagement in online learning and customer interactions, and diagnosing outliers in healthcare conditions like stroke, autism, and depression. To promote further research, we have made the code repository publicly available.
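As a rough, non-authoritative sketch of the idea (the paper's exact definition of the Average t-statistic may differ), one can compute a cross-subject t-statistic of keypoint displacement at each time instant and average it over keypoint features; the array shapes below are assumptions.

```python
# Rough sketch: per-time-instant consistency of facial keypoint motion across
# subjects, summarized as an average t-statistic over keypoint features.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# displacements: (subjects, time, keypoint features), e.g. frame-to-frame keypoint motion
displacements = rng.normal(size=(17, 300, 52))

# One-sample t-statistic across subjects at each time instant and feature,
# then averaged over features, giving one consistency value per time instant.
t_per_feature = stats.ttest_1samp(displacements, popmean=0.0, axis=0).statistic  # (time, features)
avg_t = t_per_feature.mean(axis=1)                                               # shape (time,)
```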
We introduce a new bimodal dataset recorded during affect elicitation by means of audio-visual stimuli for human emotion recognition based on facial and corporal expressions. Our dataset was collected using three devices: an RGB camera, Kinect 1, and Kinect 2. The Kinect 1 and Kinect 2 sensors provide 121 and 1,347 face key points, respectively, offering a more comprehensive analysis of facial expressions. Additionally, for the 2D RGB sequences, we utilized the feature points provided by the open-source OpenFace toolkit, which includes 68 2D facial landmarks. From these landmarks, we selected the 26 facial points that were most relevant for our emotion recognition task. To gather the data, we conducted experiments involving 17 participants. We captured both facial and skeleton keypoints, allowing for a comprehensive understanding of the participants' emotional expressions. By combining the RGB and RGB-D data from the various devices, our dataset provides a rich and diverse set of information for human emotion recognition research. This new dataset not only expands the available resources for studying human emotions but also offers a more detailed analysis with the increased number of facial keypoints provided by the Kinect sensors. Researchers can leverage this dataset to develop and evaluate more accurate and robust models for human emotion recognition, ultimately advancing our understanding of how emotions are expressed through facial and corporal cues. Please cite as: K. Amara, O. Kerdjidj and N. Ramzan, "Emotion Recognition for Affective human digital twin by means of virtual reality enabling technologies," in IEEE Access, doi: 10.1109/ACCESS.2023.3285398.
Please state your name, contact details (e-mail), institution, and position, as well as the reason for requesting access to our database.
For additional info contact:
kahina.amara88@gmail.com or kamara@cdta.dz
Naeem.Ramzan@uws.ac.uk
okerdjidj@ud.ac.ae
The CUHK Face Alignment Database is a dataset of 13,466 face images, of which 5,590 are from LFW and the remaining 7,876 were downloaded from the web. Each face is labeled with the positions of five facial keypoints. 10,000 images are used for training and the remaining 3,466 images for validation.
This dataset was created by Nguyễn Hoàng Phúc
Angle: no more than 90 degrees. All of the content is sourced from PIXTA's stock library of 100M+ Asian-featured images and videos.
Annotated Imagery Data of Face ID + 106 key point facial landmarks. This dataset contains 30,000+ images annotated for Face ID with 106 facial landmark key points. Each image is annotated with a face bounding box; attributes for race, gender, age, and skin tone; and the 106 facial landmark key points. Every record is supported by both AI and human review processes to ensure labelling consistency and accuracy.
About PIXTA: PIXTASTOCK is the largest Asian-featured stock platform, providing data, content, tools, and services since 2005. PIXTA has 15 years of experience integrating advanced AI technology to manage, curate, and process over 100M visual materials and to serve leading global brands' creative and data demands.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Indoor Facial 182 Keypoints Dataset is a specialized resource for the internet, media, entertainment, and mobile industries, focused on detailed facial analysis. It contains images of 50 people in indoor settings, with a balanced gender distribution and ages ranging from 18 to 50. Each face is annotated with 182 keypoints, making it easier to track and analyze facial features.
COCO-WholeBody is an extension of the COCO dataset with whole-body annotations. There are 4 types of bounding boxes (person box, face box, left-hand box, and right-hand box) and 133 keypoints (17 for the body, 6 for the feet, 68 for the face, and 42 for the hands) annotated for each person in the image.
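A minimal sketch of pulling the 68 face keypoints out of one COCO-WholeBody annotation is shown below; the field names ("face_kpts", "face_box", "face_valid") follow the published annotation format, but the file path is hypothetical and the fields should be checked against the release you use.

```python
# Minimal sketch: extract the face keypoints and face box from one
# COCO-WholeBody annotation entry.
import json
import numpy as np

with open("coco_wholebody_val_v1.0.json") as f:  # hypothetical path
    coco = json.load(f)

ann = coco["annotations"][0]
if ann.get("face_valid"):
    face = np.array(ann["face_kpts"]).reshape(-1, 3)  # 68 rows of (x, y, visibility)
    face_box = ann["face_box"]                        # [x, y, width, height]
    print(face.shape, face_box)
```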