This dataset was created by Sunil G.
License: CC0 1.0 (Public Domain Dedication), https://creativecommons.org/publicdomain/zero/1.0/
Recently, many applications, from biometrics to entertainment, use information extracted from face images, which carries cues about age, gender, ethnic background, and emotional state. Automatic age estimation from facial images is a popular and challenging task with applications such as controlling the content of watched media depending on the customer's age. Facial feature analysis has therefore been a topic of interest mainly due to its applicability, and deep learning techniques are now making face analysis not just a dream but a reality. This simple practice dataset can get you more acquainted with the application of deep learning to age detection.
Image: https://media.gettyimages.com/photos/facial-recognition-technology-picture-id1139859279?k=6&m=1139859279&s=612x612&w=0&h=H-i0yAM3A49I_r44424-jACD667nxiKb7bZR52ByOA=
Indian Movie Face database (IMFDB) is a large unconstrained face database consisting of 34,512 images of 100 Indian actors collected from more than 100 videos. All the images were manually selected and cropped from the video frames, resulting in a high degree of variability in terms of scale, pose, expression, illumination, age, resolution, occlusion, and makeup. IMFDB is the first face database that provides a detailed annotation of every image in terms of age, pose, gender, expression, and type of occlusion, which may help other face-related applications.
The dataset provides a total of 19,906 images. The attributes of the data are as follows:
Image: https://ars.els-cdn.com/content/image/1-s2.0-S0925231215017348-gr1.jpg
Image ref: Automatic age estimation based on CNN
CVIT focuses on basic and advanced research in image processing, computer vision, computer graphics, and machine learning. The center deals with the generation, processing, and understanding of primarily visual data, as well as with the techniques and tools required to do so efficiently. Its activity overlaps the traditional areas of computer vision, image processing, computer graphics, pattern recognition, and machine learning. CVIT works on both theoretical and practical aspects of visual information processing, and aims to keep the right balance between cutting-edge academic research and impactful applied research.
The main task is to predict the age of a person from his or her facial attributes. For simplicity, the problem has been converted into a multiclass problem with the classes Young, Middle, and Old.
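As a minimal sketch of that conversion, assuming the raw labels are numeric ages, the binning could look like the following; the cut-off ages here are hypothetical and not part of the dataset specification.

```python
# Minimal sketch: map a numeric age to one of the three coarse classes.
# The thresholds below are hypothetical placeholders, not the challenge's
# actual class boundaries.

def age_to_class(age: int) -> str:
    """Bin a numeric age into Young / Middle / Old."""
    if age < 30:       # hypothetical cut-off
        return "Young"
    elif age < 55:     # hypothetical cut-off
        return "Middle"
    return "Old"

if __name__ == "__main__":
    for age in (8, 27, 42, 71):
        print(age, "->", age_to_class(age))
```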
UTKFace is a large-scale face dataset with a long age span (ranging from 0 to 116 years). The dataset consists of over 20,000 face images with annotations of age, gender, and ethnicity. The images cover a large variation in pose, facial expression, illumination, occlusion, resolution, etc. This dataset could be used for a variety of tasks, e.g., face detection, age estimation, age progression/regression, landmark localization, etc. Some sample images are shown below:
Sample images: https://susanqq.github.io/UTKFace/icon/samples.png
Complete Dataset: https://susanqq.github.io/UTKFace/
The labels of each face image are embedded in the file name, formatted as [age]_[gender]_[race]_[date&time].jpg
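A short sketch of parsing these embedded labels from a file name (the example file name below is hypothetical and simply follows the stated pattern):

```python
import os

def parse_utkface_name(path: str) -> dict:
    """Split [age]_[gender]_[race]_[date&time].jpg into its label fields."""
    stem = os.path.splitext(os.path.basename(path))[0]
    age, gender, race = stem.split("_")[:3]  # the remaining part is the timestamp
    return {"age": int(age), "gender": int(gender), "race": int(race)}

print(parse_utkface_name("100_1_0_20170112213500903.jpg"))
# {'age': 100, 'gender': 1, 'race': 0}
```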
*If you download the data and find it useful, your upvote is valuable feedback for future work.*
Source: https://ibug.doc.ic.ac.uk/resources/300-W/
The 300-W face dataset consists of 300 Indoor and 300 Outdoor in-the-wild images. It covers a large variation of identity, expression, illumination conditions, pose, occlusion, and face size. The images were downloaded from google.com by making queries such as "party", "conference", "protests", "football", and "celebrities". Compared to other in-the-wild datasets, the 300-W database contains a larger percentage of partially occluded images and covers more expressions than the common "neutral" or "smile", such as "surprise" or "scream". Images were annotated with the 68-point mark-up using a semi-automatic methodology. The images of the database were carefully selected so that they represent a characteristic sample of challenging but natural face instances under totally unconstrained conditions. Thus, methods that achieve accurate performance on the 300-W database can be expected to perform similarly in most realistic cases. Many images of the database contain more than one annotated face (293 images with 1 face, 53 images with 2 faces, and 53 images with 3 to 7 faces). Consequently, the database consists of 600 annotated face instances, but 399 unique images. Finally, there is a large variety of face sizes: 49.3% of the faces have a size in the range [48.6k, 2.0M] pixels, and the overall mean size is 85k (about 292 × 292) pixels.
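The 68-point annotations for 300-W are distributed as .pts files; a minimal parser for that format, assuming the usual version / n_points / braces layout, might look like this:

```python
def read_pts(path: str):
    """Read a landmark annotation in the common .pts layout:
    version: 1
    n_points: 68
    {
    x1 y1
    ...
    }
    """
    with open(path) as f:
        lines = [ln.strip() for ln in f if ln.strip()]
    start, end = lines.index("{") + 1, lines.index("}")  # coordinate block
    return [tuple(map(float, ln.split())) for ln in lines[start:end]]

# points = read_pts("indoor_001.pts")  # hypothetical file name
```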
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically.
The Cat Facial Landmarks in the Wild (CatFLW) dataset contains 2079 images of cats' faces in various environments and conditions, annotated with 48 facial landmarks and a bounding box on the cat’s face.
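A minimal visualization sketch, assuming the 48 landmarks and the bounding box have already been loaded as plain Python values (the annotation file format itself is not described here):

```python
# Overlay landmarks and a bounding box on a cat face image.
# `landmarks` is assumed to be a list of (x, y) pairs and `bbox` a
# (x0, y0, x1, y1) tuple; both would come from the CatFLW annotations.
from PIL import Image, ImageDraw

def draw_annotations(image_path, landmarks, bbox, out_path="annotated.png"):
    img = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    draw.rectangle(bbox, outline=(255, 0, 0), width=2)  # face bounding box
    for x, y in landmarks:                              # 48 facial landmarks
        draw.ellipse((x - 2, y - 2, x + 2, y + 2), fill=(0, 255, 0))
    img.save(out_path)
```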
If you use the CatFLW, please cite the dataset paper:
@article{martvel2023catflw,
title={Catflw: Cat facial landmarks in the wild dataset},
author={Martvel, George and Farhat, Nareed and Shimshoni, Ilan and Zamansky, Anna},
journal={arXiv preprint arXiv:2305.04232},
year={2023}
}
You can also check out the landmark detection paper and compare the detection performance on the CatFLW:
@article{martvel2024automated,
title={Automated Detection of Cat Facial Landmarks},
author={Martvel, George and Shimshoni, Ilan and Zamansky, Anna},
journal={International Journal of Computer Vision},
pages={1--16},
year={2024},
publisher={Springer}
}
The Expression in-the-Wild (ExpW) Dataset is a comprehensive and diverse collection of facial images carefully curated to capture spontaneous and unscripted facial expressions exhibited by individuals in real-world scenarios. This extensively annotated dataset serves as a valuable resource for advancing research in the fields of computer vision, facial expression analysis, affective computing, and human behavior understanding.
Real-world Expressions: The ExpW dataset stands apart from traditional lab-controlled datasets as it focuses on capturing facial expressions in real-life environments. This authenticity ensures that the dataset reflects the natural diversity of emotions experienced by individuals in everyday situations, making it highly relevant for real-world applications.
Large and Diverse: Comprising a vast number of images, the ExpW dataset encompasses an extensive range of subjects, ethnicities, ages, and genders. This diversity allows researchers and developers to build more robust and inclusive models for facial expression recognition and emotion analysis.
Annotated Emotions: Each facial image in the dataset is meticulously annotated with corresponding emotion labels, including but not limited to happiness, sadness, anger, surprise, fear, disgust, and neutral expressions. The emotion annotations provide ground truth data for training and validating machine learning algorithms.
Various Pose and Illumination: To account for the varying challenges posed by real-life scenarios, the ExpW dataset includes images captured under different lighting conditions and poses. This variability helps researchers create algorithms that are robust to changes in illumination and head orientation.
Privacy and Ethics: ExpW has been compiled adhering to strict privacy and ethical guidelines, ensuring the subjects' consent and data protection. The dataset maintains a high level of anonymity by excluding any personal information or sensitive details.
This dataset was downloaded from the following public directory: https://drive.google.com/drive/folders/1SDcI273EPKzzZCPSfYQs4alqjL01Kybq
The dataset contains 91,793 faces manually labeled with expressions. Each of the face images is annotated as one of the seven basic expression categories: "angry (0)", "disgust (1)", "fear (2)", "happy (3)", "sad (4)", "surprise (5)", or "neutral (6)".
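For convenience, those numeric codes can be kept in a small lookup table, e.g.:

```python
# Expression codes as listed in the dataset description.
EXPW_LABELS = {
    0: "angry",
    1: "disgust",
    2: "fear",
    3: "happy",
    4: "sad",
    5: "surprise",
    6: "neutral",
}

def decode_expression(label_id: int) -> str:
    """Map a numeric expression code to its name."""
    return EXPW_LABELS[label_id]

print(decode_expression(3))  # happy
```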
iBugMask is an in-the-wild face parsing dataset that contains 1,000 challenging face images and manually annotated labels for 11 semantic classes: background, facial skin, left/right brow, left/right eye, nose, upper/lower lip, inner mouth, and hair. The images are curated from challenging in-the-wild face alignment datasets, including 300W and Menpo. Compared with existing face parsing datasets, iBugMask contains in-the-wild scenarios such as "party" and "conference", which include more challenging appearance variations and multiple faces. It also contains a larger number of profile faces and includes more expressions than "neutral" and "smile" (e.g., "surprise" and "scream").
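A small helper for inspecting parsing masks; the class names come from the description above, but the integer index assignment below is an assumption for illustration only:

```python
import numpy as np

# Class names from the iBugMask description; the ID order is assumed here
# and should be checked against the dataset's actual label definition.
IBUGMASK_CLASSES = [
    "background", "facial skin", "left brow", "right brow",
    "left eye", "right eye", "nose", "upper lip",
    "lower lip", "inner mouth", "hair",
]

def class_histogram(mask: np.ndarray) -> dict:
    """Count pixels per class in an integer-valued parsing mask."""
    ids, counts = np.unique(mask, return_counts=True)
    return {IBUGMASK_CLASSES[int(i)]: int(c) for i, c in zip(ids, counts)}
```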
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically.
The Dog Facial Landmarks in the Wild (DogFLW) dataset contains 4335 images of dogs' faces in various environments and conditions, annotated with 46 facial landmarks and a bounding box on the dog’s face.
If you use the DogFLW, please cite the paper:
@article{martvel2025dog,
title={Dog facial landmarks detection and its applications for facial analysis},
author={Martvel, George and Zamansky, Anna and Pedretti, Giulia and Canori, Chiara and Shimshoni, Ilan and Bremhorst, Annika},
journal={Scientific Reports},
volume={15},
number={1},
pages={21886},
year={2025},
publisher={Nature Publishing Group UK London}
}
You can also check out the cat landmark detection paper:
@article{martvel2024automated,
title={Automated Detection of Cat Facial Landmarks},
author={Martvel, George and Shimshoni, Ilan and Zamansky, Anna},
journal={International Journal of Computer Vision},
pages={1--16},
year={2024},
publisher={Springer}
}
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically.
A binary semantic segmentation dataset for glasses. It covers a single category, GLASSES (ID=16), of the original Face Synthetics dataset introduced in "Fake It Till You Make It: Face Analysis in the Wild Using Synthetic Data Alone".
The dataset has the following structure (a total of 14,303 256x256 face-centered images and corresponding masks):
```bash
└── face-synthetics-glasses
    ├── test
    │   ├── images <- 1450 (about 10%) of 256x256 test images
    │   └── masks  <- 1450 (about 10%) of 256x256 corresponding masks
    │
    ├── train
    │   ├── images <- 11372 (about 80%) of 256x256 train images
    │   └── masks  <- 11372 (about 80%) of 256x256 corresponding masks
    │
    └── val
        ├── images <- 1481 (about 10%) of 256x256 validation images
        └── masks  <- 1481 (about 10%) of 256x256 corresponding masks
```
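A minimal loading sketch for this layout, assuming image and mask files within a split share the same file name:

```python
# Pair up images and masks for one split of face-synthetics-glasses.
# The shared-file-name assumption is based on the directory layout above.
from pathlib import Path
from PIL import Image

def load_split(root: str, split: str = "train"):
    """Yield (image, mask) PIL pairs for the given split."""
    img_dir = Path(root) / split / "images"
    mask_dir = Path(root) / split / "masks"
    for img_path in sorted(img_dir.iterdir()):
        yield Image.open(img_path), Image.open(mask_dir / img_path.name)

# for image, mask in load_split("face-synthetics-glasses", "val"):
#     ...  # e.g., feed into a segmentation training loop
```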
The dataset was collected in the following way (full processing script is available in this gist):
Step 1: The original dataset parts were downloaded from the official links, then unzipped:
```bash
cat dataset_100000.zip.* > dataset_100000.zip
unzip dataset_100000.zip
```
Step 2: Extracted only those image and mask pairs for which glasses exist and split them randomly into train, test, and val directories (a sketch of this glasses check appears after Step 3)
Step 3: Applied Face Crop Plus with the provided landmarks to crop only the face area with a varying face factor from 0.65 to 0.95
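A hedged sketch of the Step 2 glasses check and the binarization, using the GLASSES class ID (16) of the original Face Synthetics masks; the Face Crop Plus step is omitted, and the mask file handling is an assumption:

```python
import numpy as np
from PIL import Image

GLASSES_ID = 16  # class ID of GLASSES in the original Face Synthetics masks

def has_glasses(seg_path: str) -> bool:
    """Check whether a multi-class segmentation mask contains glasses pixels."""
    seg = np.array(Image.open(seg_path))
    return bool((seg == GLASSES_ID).any())

def to_binary_mask(seg_path: str) -> Image.Image:
    """Convert a multi-class mask into a binary glasses mask (255 = glasses)."""
    seg = np.array(Image.open(seg_path))
    return Image.fromarray(((seg == GLASSES_ID) * 255).astype(np.uint8))
```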
@misc{face-synthetics-glasses,
author = {{Kaggle Contributors}},
title = {Face Synthetics Glasses},
year = {2024},
publisher = {Kaggle},
journal = {Kaggle datasets},
howpublished = {\url{https://www.kaggle.com/datasets/mantasu/face-synthetics-glasses}}
}
@inproceedings{wood2021fake,
title={Fake it till you make it: face analysis in the wild using synthetic data alone},
author={Wood, Erroll and Baltru{\v{s}}aitis, Tadas and Hewitt, Charlie and Dziadzio, Sebastian and Cashman, Thomas J and Shotton, Jamie},
booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
pages={3681--3691},
year={2021}
}