100+ datasets found

Data from: Human Faces Dataset
kaggle.com
Updated Aug 26, 2024
Cite
Kaustubh Dhote (2024). Human Faces Dataset [Dataset]. https://www.kaggle.com/datasets/kaustubhdhote/human-faces-dataset
Dataset updated
Aug 26, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Kaustubh Dhote
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
The dataset contains around 9.6k images of human faces, comprising both real photographs and AI-generated images.

The zip contains two folders:
- Real Images: 5000 images of real human faces
- AI-Generated Images: 4630 images of AI-generated human faces
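The two-folder layout described above maps naturally onto a binary labeling scheme. The sketch below is one hedged way to build a list of (path, label) pairs from it. The folder names come from the description; the extraction root, the 0/1 label assignment, and the accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Folder names are taken from the dataset description. The label values
# and the set of image extensions are assumptions for this sketch.
LABELS = {"Real Images": 0, "AI-Generated Images": 1}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root):
    """Return (path, label) pairs for every image under the two class folders."""
    samples = []
    for folder, label in LABELS.items():
        class_dir = Path(root) / folder
        if not class_dir.is_dir():
            continue  # tolerate a missing folder rather than failing
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label))
    return samples
```

If the counts in the description hold, calling `list_labeled_images` on the extracted archive should return roughly 5000 samples with label 0 and 4630 with label 1.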
Custom Face Recognition Image Dataset
kaggle.com
zip
Updated Jul 3, 2025
+ more versions
Cite
Unidata (2025). Custom Face Recognition Image Dataset [Dataset]. https://www.kaggle.com/datasets/unidpro/face-recognition-image-dataset
Explore at:
Available download formats: zip (27609695 bytes)
Dataset updated
Jul 3, 2025
Authors
Unidata
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Image dataset of face images for computer vision tasks

Dataset comprises 500,600+ images of individuals representing various races, genders, and ages, with each person having a single face image. It is designed for facial recognition and face detection research, supporting the development of advanced recognition systems.

By leveraging this dataset, researchers and developers can enhance deep learning models, improve face verification and face identification techniques, and refine detection algorithms to recognize faces more accurately in real-world scenarios.

Metadata for the dataset

All images come with rigorously verified metadata annotations (age, gender, ethnicity), achieving ≥95% labeling accuracy. Images are also captured under different lighting conditions and resolutions, enhancing the dataset's utility for computer vision tasks and image classification.

💵 Buy the Dataset: This is a limited preview of the data. To access the full dataset, please contact us at https://unidata.pro to discuss your requirements and pricing options.

Researchers can leverage this dataset to improve recognition technology and develop learning models that enhance the accuracy of face detections. The dataset also supports projects focused on face anti-spoofing and deep learning applications, making it an essential tool for those studying biometric security and liveness detection technologies.

🌐 UniData provides high-quality datasets, content moderation, data collection and annotation for your AI/ML projects
Male Faces - Image Dataset
kaggle.com
zip
Updated May 2, 2024
+ more versions
Cite
Unique Data (2024). Male Faces - Image Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/male-selfie-image-dataset
Explore at:
Available download formats: zip (66375081 bytes)
Dataset updated
May 2, 2024
Authors
Unique Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Face Recognition, Face Detection, Male Photo Dataset 👨

The dataset is based on the Selfies and ID Dataset.

110,000+ photos of 74,000+ men from 141 countries. The dataset includes photos of people's faces. All people presented in the dataset are men. The dataset contains a variety of images capturing individuals from diverse backgrounds and age groups.

Our dataset will diversify your data by adding more photos of men of different ages and ethnic groups, enhancing the quality of your model.

[Image: people in the dataset]

The dataset can be utilized for a wide range of tasks, including face recognition, age estimation, image feature extraction, or any problem related to human image analysis.

👉 Legally sourced and carefully structured datasets for AI training and model development. Explore samples from our dataset of 95,000+ human images & videos.

Metadata for the dataset:

id - unique identifier of the media file

photo - link to access the photo

age - age of the person

country - country of the person

ethnicity - ethnicity of the person

photo_extension - photo extension

photo_resolution - photo resolution

Statistics for the dataset: [image omitted]

🧩 This is just an example of the data. Leave a request here to learn more

Content

The dataset consists of:
- files - includes 20 images corresponding to each person in the sample
- .csv file - contains information about the images and people in the dataset

File with the extension .csv

id - id of the person

age - age of the person

country - country of the person

ethnicity - ethnicity of the person

photo_extension - extension of the photo

photo_resolution - resolution of the photo
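As a rough illustration of how the .csv columns listed above could be used, the sketch below reads the metadata with Python's standard `csv` module and filters rows by age. The column names come from the listing; the sample rows, their values, and the resolution format are invented purely for demonstration, and the real file may differ.

```python
import csv
import io


def load_metadata(csv_file):
    """Read metadata rows into dicts, converting age to int where possible."""
    rows = []
    for row in csv.DictReader(csv_file):
        age = (row.get("age") or "").strip()
        row["age"] = int(age) if age.isdigit() else None
        rows.append(row)
    return rows


def filter_by_age(rows, min_age, max_age):
    """Keep rows whose age is known and lies in [min_age, max_age]."""
    return [r for r in rows if r["age"] is not None and min_age <= r["age"] <= max_age]


# Made-up sample rows that only reuse the documented column names.
sample = io.StringIO(
    "id,age,country,ethnicity,photo_extension,photo_resolution\n"
    "1,25,US,White,jpg,1080x1920\n"
    "2,41,IN,Asian,png,720x1280\n"
)
rows = load_metadata(sample)
young_adults = filter_by_age(rows, 18, 30)
print([r["id"] for r in young_adults])
```

With a real download, the `io.StringIO` sample would be replaced by an open file handle on the shipped .csv file.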

🚀 You can learn more about our high-quality unique datasets here

keywords: biometric system, biometric dataset, face recognition database, face recognition dataset, face detection dataset, facial analysis, object detection dataset, deep learning datasets, computer vision dataset, human images dataset, human faces dataset, machine learning, image-to-image, verification models, digital photo-identification, men images, males dataset, male selfie, male face recognition
AI-Face-Dataset-3000_Images
kaggle.com
zip
Updated Aug 26, 2024
Cite
Muhammad Shavaiz (2024). AI-Face-Dataset-3000_Images [Dataset]. https://www.kaggle.com/datasets/shavaizbutt/ai-face-dataset-3000-images
Explore at:
Available download formats: zip (3972046713 bytes)
Dataset updated
Aug 26, 2024
Authors
Muhammad Shavaiz
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset is a curated subset of 3000 images extracted from a larger collection of approximately 80,000 AI-generated faces. It features diverse, synthetic facial images created using advanced generative models, each with unique characteristics and expressions. Designed for focused testing and smaller-scale machine learning tasks, this subset offers a manageable sample size for experimentation with facial recognition and model validation. For broader applications and comprehensive studies, refer to the full dataset available at Original Dataset.

To access images in the ai-face-dataset-3000-images directory on Kaggle, list the files using os.listdir('/kaggle/input/ai-face-dataset-3000-images'). You can then load and process an image using libraries like PIL with Image.open('/kaggle/input/ai-face-dataset-3000-images/your-image-file.jpg').
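The access pattern described above can be sketched as follows. The directory path is the one quoted in the description and only exists inside a Kaggle notebook. The extension filter and the helper names are assumptions, and `os.listdir` only sees the top level of the directory, so images nested in subfolders would need `os.walk` instead. Pillow is imported lazily so that the listing step works even where Pillow is not installed.

```python
import os

# Path quoted in the dataset description; it only exists on Kaggle.
DATA_DIR = "/kaggle/input/ai-face-dataset-3000-images"


def list_image_files(directory, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted names of files in `directory` that have an image extension."""
    return sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(extensions)
    )


def load_image(directory, name):
    """Open one image with Pillow (imported lazily, so listing works without it)."""
    from PIL import Image
    return Image.open(os.path.join(directory, name))


if os.path.isdir(DATA_DIR):
    files = list_image_files(DATA_DIR)
    print(len(files), "image files found")
```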
Face Re-identification Image Dataset
kaggle.com
zip
Updated Jul 7, 2025
+ more versions
Cite
Unidata (2025). Face Re-identification Image Dataset [Dataset]. https://www.kaggle.com/datasets/unidpro/face-re-identification-image-dataset
Explore at:
Available download formats: zip (17758297 bytes)
Dataset updated
Jul 7, 2025
Authors
Unidata
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Dataset of face images with different angles and head positions

The dataset contains 23,110 individuals, each contributing 28 images featuring various angles and head positions, diverse backgrounds, and attributes, along with 1 ID photo. In total, the dataset comprises over 670,000 images in formats such as JPG and PNG. It is designed to advance face recognition research, focusing on person re-identification and recognition systems.
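The total quoted above can be cross-checked from the per-person counts. This is only an arithmetic sanity check, assuming every individual contributes exactly 28 pose images plus 1 ID photo.

```python
individuals = 23_110
pose_images_per_person = 28
id_photos_per_person = 1

# 23,110 people x 29 images each
total_images = individuals * (pose_images_per_person + id_photos_per_person)
print(total_images)  # 670190
```

The result, 670,190, is consistent with the stated "over 670,000 images".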

By utilizing this dataset, researchers can explore various recognition applications, including face verification and face identification.

Metadata for the dataset

Face pose labels are more than 97% accurate, ensuring reliable data for training and testing recognition algorithms.

💵 Buy the Dataset: This is a limited preview of the data. To access the full dataset, please contact us at https://unidata.pro to discuss your requirements and pricing options.

Dataset includes high-quality images that capture human faces in different poses and expressions, allowing for comprehensive analysis in recognition tasks. It is particularly valuable for developing and evaluating deep learning models and computer vision techniques.

🌐 UniData provides high-quality datasets, content moderation, data collection and annotation for your AI/ML projects
Human faces and object dataset
kaggle.com
zip
Updated Apr 25, 2025
+ more versions
Cite
Kapil Ishwarkar (2025). Human faces and object dataset [Dataset]. https://www.kaggle.com/datasets/kapilishwarkar/human-faces-and-object-dataset
Explore at:
Available download formats: zip (196077869 bytes)
Dataset updated
Apr 25, 2025
Authors
Kapil Ishwarkar
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset Description: Human Faces and Objects Dataset (HFO-5000)
The Human Faces and Objects Dataset (HFO-5000) is a curated collection of 5,000 images, categorized into three distinct classes: male faces (1,500), female faces (1,500), and objects (2,000). This dataset is designed for machine learning and computer vision applications, including image classification, face detection, and object recognition. The dataset provides high-quality, labeled images with a structured CSV file for seamless integration into deep learning pipelines.

Column Description: The dataset is accompanied by a CSV file that contains essential metadata for each image. The CSV file includes the following columns:
- file_name: The name of the image file (e.g., image_001.jpg).
- label: The category of the image, with three possible values: "male" (for male face images), "female" (for female face images), "object" (for images of various objects).
- file_path: The full or relative path to the image file within the dataset directory.
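One way to check the class distribution from the CSV described above is to count rows per label. The sketch below uses only the documented column names and label values. The sample rows and the `images/` path prefix are invented for illustration, and the actual CSV file name is not stated in the description.

```python
import csv
import io
from collections import Counter

# Label values documented in the column description.
EXPECTED_LABELS = {"male", "female", "object"}


def count_labels(csv_file):
    """Count rows per label, raising on any label outside the documented set."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        label = row["label"]
        if label not in EXPECTED_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        counts[label] += 1
    return counts


# Made-up rows that follow the documented columns.
sample = io.StringIO(
    "file_name,label,file_path\n"
    "image_001.jpg,male,images/image_001.jpg\n"
    "image_002.jpg,female,images/image_002.jpg\n"
    "image_003.jpg,object,images/image_003.jpg\n"
    "image_004.jpg,object,images/image_004.jpg\n"
)
counts = count_labels(sample)
print(dict(counts))
```

Run against the real file, the counts would be expected to match the 1,500 / 1,500 / 2,000 split given in the description.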

Uniqueness and Key Features:
1) Balanced Distribution: The dataset maintains an even distribution of human faces (male and female) to minimize bias in classification tasks.
2) Diverse Object Selection: The object category consists of a wide variety of items, ensuring robustness in distinguishing between human and non-human entities.
3) High-Quality Images: The dataset consists of clear and well-defined images, suitable for both training and testing AI models.
4) Structured Annotations: The CSV file simplifies dataset management and integration into machine learning workflows.
5) Potential Use Cases: This dataset can be used for tasks such as gender classification, facial recognition benchmarking, human-object differentiation, and transfer learning applications.

Conclusion: The HFO-5000 dataset provides a well-structured, diverse, and high-quality set of labeled images that can be used for various computer vision tasks. Its balanced distribution of human faces and objects ensures fairness in training AI models, making it a valuable resource for researchers and developers. By offering structured metadata and a wide range of images, this dataset facilitates advancements in deep learning applications related to facial recognition and object classification.
50K Celebrity Faces Image Dataset
kaggle.com
Updated Aug 3, 2023
Cite
Farzad Nekouei (2023). 50K Celebrity Faces Image Dataset [Dataset]. https://www.kaggle.com/datasets/farzadnekouei/50k-celebrity-faces-image-dataset
Dataset updated
Aug 3, 2023
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Farzad Nekouei
Description
This dataset is a curated subset of the CelebFaces Attributes (CelebA) Dataset, handpicked for deep learning tasks such as image synthesis and facial recognition. It includes 50,000 celebrity face images from diverse identities, covering a wide range of poses, backgrounds, and facial attributes. These images are suitable for experimenting with GANs, facial recognition models, and other machine learning tasks related to face analysis.

This dataset is perfect for hobbyists, researchers, and machine learning practitioners looking to experiment with a manageable yet diverse collection of celebrity face images.
Middle Eastern Occluded Facial Image Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Cite
FutureBee AI (2022). Middle Eastern Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-middle-east
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the Middle Eastern Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.
Facial Image Data
The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:
•Occluded Images: 5 images per individual featuring different types of facial occlusions: masks, caps, sunglasses, or combinations of these accessories
•Normal Image: 1 reference image of the same individual without any occlusion
Diversity & Representation
•Geographic Coverage: Participants from across Egypt, Jordan, Saudi Arabia, UAE, Tunisia, and more Middle Eastern countries
•Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
•File Formats: Images available in JPEG and HEIC formats
Image Quality & Capture Conditions
To ensure robustness and real-world utility, images were captured under diverse conditions:
•Lighting Variations: Includes both natural and artificial lighting scenarios
•Background Diversity: Indoor and outdoor backgrounds for model generalization
•Device Quality: Captured using the latest smartphones to ensure high resolution and consistency
Metadata
Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:
•Unique Participant ID
•File Name
•Age
•Gender
•Country
•Demographic Profile
•Type of Occlusion
•File Format
This rich metadata helps train models that can recognize faces even when partially obscured.
Use Cases & Applications
This dataset is ideal for a wide range of real-world and research-focused applications, including:
•Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
•Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
•Biometric Identity Systems: Enhance verification accuracy across varying conditions
•KYC & Compliance: Support face matching even when the selfie includes common occlusions
•Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage
Secure & Ethical Collection
•Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
•Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
•Transparent Participation: All contributors provided written consent and were informed of the intended use
Dataset
Large-scale Labeled Faces (LSLF) Dataset.zip
figshare.com
Updated Jun 1, 2023
Cite
Tarik Alafif; Zeyad Hailat; Melih Aslan; Xuewen Chen (2023). Large-scale Labeled Faces (LSLF) Dataset.zip [Dataset]. http://doi.org/10.6084/m9.figshare.13077329.v1
Explore at:
Unique identifier
https://doi.org/10.6084/m9.figshare.13077329.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Tarik Alafif; Zeyad Hailat; Melih Aslan; Xuewen Chen
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Our LSLF dataset consists of 1,195,976 labeled face images for 11,459 individuals. These images are stored in JPEG format with a total size of 5.36 GB. Individuals have a minimum of 1 face image and a maximum of 1,157 face images. The average number of face images per individual is 104. Each image is automatically named as (PersonName VideoNumber FrameNumber ImageNumber) and stored in the related individual folder.
Similar Face Dataset (SFD)
figshare.com
zip
Updated Jan 15, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
AnPing Song (2020). Similar Face Dataset (SFD) [Dataset]. http://doi.org/10.6084/m9.figshare.11611071.v3
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.11611071.v3
Dataset updated
Jan 15, 2020
Dataset provided by
Figshare: http://figshare.com/
Authors
AnPing Song
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Similar face recognition has always been one of the most challenging research directions in face recognition. This project shares the similar face images (SFD.zip) that we have collected so far. All images are labeled and collected from publicly available datasets such as LFW and CASIA-WebFace. We will continue to collect larger-scale data and update this project. Because the data set is too large, we uploaded a compressed zip file (SFD.zip). We have also uploaded a few examples for everyone to view. Email: ileven@shu.edu.cn
Data from: Pgu-Face: a dataset of partially covered facial images
data.mendeley.com
search.datacite.org
Updated Aug 24, 2016
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
seyed reza salari (2016). Pgu-Face: a dataset of partially covered facial images [Dataset]. http://doi.org/10.17632/znpyrgbfdr.1
Explore at:
Unique identifier
https://doi.org/10.17632/znpyrgbfdr.1
Dataset updated
Aug 24, 2016
Authors
seyed reza salari
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Pgu-Face dataset contains 896 images from 224 different subjects. All of the subjects were Iranian men, and most of them live in tropical regions of the southwest of Iran. The subjects' ages ranged from 16 to 82 years, with an average of 27.89 years. In addition, we make the following information available for the subjects: age and the camera quality in megapixels.
Faces: Age Detection from Images
gts.ai
csv, jpeg, json
Updated Mar 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Globose Technology Solutions Private Limited (2024). Faces: Age Detection from Images [Dataset]. https://gts.ai/dataset-download/faces-age-detection-from-images/
Explore at:
Available download formats: csv, json, jpeg
Dataset updated
Mar 28, 2024
Dataset authored and provided by
Globose Technology Solutions Private Limited
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
A large-scale dataset for age estimation from facial images, including Indian Movie Face Database (IMFDB) with 19,906 labeled images and UTKFace with over 20,000 images labeled with age, gender, and ethnicity. Useful for AI, biometrics, and facial recognition research.
SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and...
zenodo.org
zip
Updated Dec 15, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Riya Samanta; Bidyut Saha (2024). SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and Tracking [Dataset]. http://doi.org/10.5281/zenodo.14474899
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.14474899
Dataset updated
Dec 15, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Riya Samanta; Bidyut Saha
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and Tracking

Description
SoloFace is a custom dataset derived from the COCO-Faces and Visual Wake Word datasets, specifically designed for single-face detection tasks in resource-constrained environments. This dataset is ideal for developing machine learning models for embedded AI applications, such as TinyML, which operate on low-power devices. Each image either contains a single human face or no face, with corresponding labels providing class information and bounding box coordinates for face detection. The dataset includes data augmentation to ensure robustness across diverse conditions, such as variations in lighting, scale, and orientation.

Dataset Structure
The dataset is organized into three subsets: train, test, and val. Each subset contains:

images/: .jpg image files.

labels/: .json label files with matching filenames to the images.

Label Format
Each .json label file includes:

image: Name of the corresponding image file.

class: 1 if a face is present, 0 otherwise.

bbox: Normalized bounding box coordinates [top_left_x, top_left_y, bottom_right_x, bottom_right_y]. If no face is present, the bounding box is set to [0.0, 0.0, 0.01, 0.01].
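To make the label format above concrete, the sketch below converts a normalized bounding box into pixel coordinates for an image of known size. The keys `image`, `class`, and `bbox` are the documented ones. The example label values, the image size, and the choice to return `None` for no-face labels are assumptions made for illustration.

```python
import json


def bbox_to_pixels(label, width, height):
    """Convert a normalized [x1, y1, x2, y2] box to integer pixel coordinates.

    Returns None when the label's class is 0 (no face), since the stored
    box is then only a placeholder.
    """
    if label["class"] == 0:
        return None
    x1, y1, x2, y2 = label["bbox"]
    return (round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height))


# Hypothetical label content following the documented keys.
label = json.loads(
    '{"image": "example.jpg", "class": 1, "bbox": [0.25, 0.1, 0.75, 0.9]}'
)
print(bbox_to_pixels(label, 320, 240))  # (80, 24, 240, 216)
```

Treating class 0 separately avoids drawing or training on the `[0.0, 0.0, 0.01, 0.01]` placeholder box as if it were a real detection.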

Statistics

Original Dataset:

Training images: 11,272

Testing images: 3,732

Validation images: 434

After Data Augmentation:

Training images: 56,360

Testing and validation images remain unchanged.

Class Distribution:

50% of images contain a single visible human face.

50% contain no human face.

Data Augmentation Details
To improve model robustness, the following augmentation techniques were applied to the training set:

Geometric Transformations: Random rotation (±15 degrees), scaling (±20%), and horizontal flipping (50%).

Color Transformations: Brightness and contrast adjustments (±30%).

Cropping: Random cropping up to 10% from image edges.

Each augmentation preserved bounding box consistency with the transformed images.
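As an illustration of what preserving bounding box consistency means for one of the listed transformations, the sketch below mirrors a normalized box under a horizontal flip. It is not the authors' augmentation code; it only shows the coordinate arithmetic for the `[top_left_x, top_left_y, bottom_right_x, bottom_right_y]` convention described in the label format.

```python
def hflip_bbox(bbox):
    """Mirror a normalized [x1, y1, x2, y2] box across the vertical centre line."""
    x1, y1, x2, y2 = bbox
    # The old right edge becomes the new left edge, and vice versa,
    # so the result still satisfies x1 <= x2.
    return [1.0 - x2, y1, 1.0 - x1, y2]


original = [0.25, 0.25, 0.5, 0.75]
flipped = hflip_bbox(original)
print(flipped)  # [0.5, 0.25, 0.75, 0.75]
```

Flipping twice returns the original box, which is a convenient sanity check for any augmentation pipeline. Rotation, scaling, and cropping need analogous box updates.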

Usage
This dataset supports the following use cases:

Training lightweight face detection models optimized for microcontroller deployment.

Benchmarking single-face detection models in resource-constrained environments.

Research on model robustness and efficiency.

Loading the Dataset

Download the dataset.

Extract the dataset using:
unzip soloface-detection-dataset.zip

Dataset structure:
soloface-detection-dataset/
├── train/
│   ├── images/
│   ├── labels/
├── test/
│   ├── images/
│   ├── labels/
├── val/
│   ├── images/
│   ├── labels/
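Given the directory layout above, images and labels can be paired by filename. The sketch below assumes that "matching filenames" means the image and label share the same stem (for example `0001.jpg` and `0001.json`); the helper name and that interpretation are assumptions, not part of the dataset documentation.

```python
from pathlib import Path


def paired_samples(root, split):
    """Return (image_path, label_path) pairs whose filename stems match."""
    split_dir = Path(root) / split
    labels = {p.stem: p for p in (split_dir / "labels").glob("*.json")}
    pairs = []
    for image in sorted((split_dir / "images").glob("*.jpg")):
        label = labels.get(image.stem)
        if label is not None:  # skip images without a label file
            pairs.append((image, label))
    return pairs
```

For example, `paired_samples("soloface-detection-dataset", "train")` would return the training pairs after extraction.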

License
This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Permissions: Copy, distribute, and adapt for any purpose, including commercial.

Conditions: Provide proper attribution, a link to the license, and indicate changes.

Restrictions: No additional legal or technological restrictions.

For more details, visit the CC BY 4.0 License.

Contact
For inquiries or collaborations, please contact:

Bidyut Saha: sahabidyut999@gmail.com

Riya Samanta: study.riya1792@gmail.com

110 People Face Image Dataset – Multi-Angle, Multi-Light, Multi-Expression,...
nexdata.ai
Updated Oct 21, 2023
+ more versions
Cite
Nexdata (2023). 110 People Face Image Dataset – Multi-Angle, Multi-Light, Multi-Expression, Annotated [Dataset]. https://www.nexdata.ai/datasets/computervision/4
Explore at:
Dataset updated
Oct 21, 2023
Dataset authored and provided by
Nexdata
Variables measured
Device, Accuracy, Data size, Data format, Data diversity, Age distribution, Race distribution, Gender distribution, Collecting environment
Description
The 110 People – Human Face Image Data was captured by camera from 110 participants, with a balanced gender ratio and age-group distribution covering major skin tones. Each person contributes 2100 pictures varying in glasses/no glasses, expression, camera angle, and lighting conditions. All attributes, such as gender, age, and expression, are annotated. The overall accuracy rate is ≥ 97%. This dataset is suitable for face recognition, facial expression analysis, and AI training.
140k Real and Fake Faces
kaggle.com
zip
Updated Feb 10, 2020
Cite
xhlulu (2020). 140k Real and Fake Faces [Dataset]. https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces
Explore at:
Available download formats: zip (4024555718 bytes)
Dataset updated
Feb 10, 2020
Authors
xhlulu
Description
This dataset consists of all 70k REAL faces from the Flickr dataset collected by Nvidia, as well as 70k fake faces sampled from the 1 Million FAKE faces (generated by StyleGAN) that was provided by Bojan.

In this dataset, I conveniently combined both datasets, resized all the images to 256px, and split the data into train, validation, and test sets. I also included some CSV files for convenience.

For more details, check out these threads:
* Thread for real faces dataset: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/122786
* 1 Million Fake faces: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/121173
Native American Occluded Facial Image Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Native American Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-native-american
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the Native American Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.
Facial Image Data
The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:
•Occluded Images: 5 images per individual featuring different types of facial occlusions: masks, caps, sunglasses, or combinations of these accessories
•Normal Image: 1 reference image of the same individual without any occlusion
Diversity & Representation
•Geographic Coverage: Participants from across USA, Canada, Mexico and more Native American countries
•Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
•File Formats: Images available in JPEG and HEIC formats
Image Quality & Capture Conditions
To ensure robustness and real-world utility, images were captured under diverse conditions:
•Lighting Variations: Includes both natural and artificial lighting scenarios
•Background Diversity: Indoor and outdoor backgrounds for model generalization
•Device Quality: Captured using the latest smartphones to ensure high resolution and consistency
Metadata
Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:
•Unique Participant ID
•File Name
•Age
•Gender
•Country
•Demographic Profile
•Type of Occlusion
•File Format
This rich metadata helps train models that can recognize faces even when partially obscured.
Use Cases & Applications
This dataset is ideal for a wide range of real-world and research-focused applications, including:
•Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
•Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
•Biometric Identity Systems: Enhance verification accuracy across varying conditions
•KYC & Compliance: Support face matching even when the selfie includes common occlusions
•Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage
Secure & Ethical Collection
•Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
•Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
•Transparent Participation: All contributors provided written consent and were informed of the intended use
Dataset Updates &
South Asian Children Facial Image Dataset for Facial Recognition
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). South Asian Children Facial Image Dataset for Facial Recognition [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-minor-south-asian
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
South Asia
Dataset funded by
FutureBeeAI
Description
Introduction
The South Asian Children Facial Image Dataset is a thoughtfully curated collection designed to support the development of advanced facial recognition systems, biometric identity verification, age estimation tools, and child-specific AI models. This dataset enables researchers and developers to build highly accurate, inclusive, and ethically sourced AI solutions for real-world applications.
Facial Image Data
The dataset includes over 1500 high-resolution image sets of children under the age of 18. Each participant contributes approximately 15 unique facial images, captured to reflect natural variations in appearance and context.
Diversity and Representation
•Geographic Coverage: Children from India, Pakistan, Bangladesh, Nepal, Sri Lanka, Bhutan, Maldives, and more
•Age Group: All participants are minors, with a wide age spread across childhood and adolescence.
•Gender Balance: Includes both boys and girls, representing a balanced gender distribution.
•File Formats: Images are available in JPEG and HEIC formats.
Quality and Image Conditions
To ensure robust model training and generalizability, images are captured under varied natural conditions:
• Lighting: A mix of lighting setups, including indoor, outdoor, bright, and low-light scenarios.
• Backgrounds: Diverse backgrounds (plain, natural, and everyday environments) are included to promote realism.
• Capture Devices: All photos are taken using modern mobile devices, ensuring high resolution and sharp detail.

Metadata
Each child’s image set is paired with detailed, structured metadata, enabling granular control and filtering during model training:
•Unique Participant ID
•File Name
•Age
•Gender
•Country
•Demographic Attributes
•File Format
This metadata is essential for applications that require demographic awareness, such as region-specific facial recognition or bias mitigation in AI models.
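The filtering described above can be sketched in a few lines. This is a minimal illustration only: the listing names the metadata fields but not their column names, types, or file layout, so the `ImageRecord` class, its attribute names, the `filter_records` helper, and the sample rows below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical field names modeled on the metadata list above.
    participant_id: str
    file_name: str
    age: int
    gender: str
    country: str
    file_format: str


def filter_records(
    records: Iterable[ImageRecord],
    *,
    country: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
    file_format: Optional[str] = None,
) -> List[ImageRecord]:
    """Return the records that satisfy every filter that was supplied."""
    selected = []
    for record in records:
        if country is not None and record.country != country:
            continue
        if min_age is not None and record.age < min_age:
            continue
        if max_age is not None and record.age > max_age:
            continue
        if file_format is not None and record.file_format != file_format:
            continue
        selected.append(record)
    return selected


# Placeholder rows for illustration; these are not taken from the dataset.
sample = [
    ImageRecord("P0001", "P0001_01.jpeg", 9, "female", "India", "JPEG"),
    ImageRecord("P0001", "P0001_02.heic", 9, "female", "India", "HEIC"),
    ImageRecord("P0002", "P0002_01.jpeg", 15, "male", "Nepal", "JPEG"),
]

jpeg_under_12 = filter_records(sample, max_age=12, file_format="JPEG")
print([r.file_name for r in jpeg_under_12])
```

Keeping every filter optional lets the same helper serve region-specific selection and age-band selection without separate code paths.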
Applications
This dataset is ideal for a wide range of computer vision use cases, including:
• Facial Recognition: Improving identification accuracy across diverse child demographics.
• KYC and Identity Verification: Enabling more inclusive onboarding processes for child-specific platforms.
• Biometric Systems: Supporting child-focused identity verification in education, healthcare, or travel.
• Age Estimation: Training AI models to estimate age ranges of children from facial features.
• Child Safety Models: Assisting in missing child identification or online content moderation.
• Generative AI Training: Creating more representative synthetic data using real-world diverse inputs.

Ethical Collection and Data Security
We maintain the highest ethical and security standards throughout the data lifecycle:
• Guardian Consent: Every participant’s guardian provided informed, written consent, clearly outlining the dataset’s use cases.
• Privacy-First Approach: Personally identifiable information is not shared. Only anonymized metadata is included.
• Secure Storage:
BioID Face Database
bioid.com
Updated Oct 12, 2011
Link copied
Cite
BioID (2011). BioID Face Database [Dataset]. https://www.bioid.com/face-database/
Available download formats: text/csv+zip, text//x-portable-graymap+zip
Dataset updated
Oct 12, 2011
Dataset authored and provided by
BioID
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Pixel
Description
The BioID Face Database was recorded and published so that researchers working on face detection can compare the quality of their algorithms with others. During recording, special emphasis was placed on real-world conditions, so the test set features a large variety of illumination, background, and face size. The dataset consists of 1521 gray-level images with a resolution of 384x286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. For comparison purposes, the set also contains manually set eye positions. The images are labeled BioID_xxxx.pgm, where the characters xxxx are replaced by the index of the current image (with leading zeros). Similarly, the files BioID_xxxx.eye contain the eye positions for the corresponding images.
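The naming scheme can be illustrated with a short sketch that pairs each image with its eye-position file. The description gives the four-digit, zero-padded pattern and the count of 1521 images, but not the first index; starting at 0 is an assumption here, and the helper names are hypothetical.

```python
from typing import List, Tuple


def bioid_image_name(index: int) -> str:
    """Build an image file name such as BioID_0007.pgm."""
    if not 0 <= index <= 9999:
        raise ValueError("index must fit in four digits")
    return f"BioID_{index:04d}.pgm"


def bioid_eye_name(index: int) -> str:
    """Build the matching eye-position file name such as BioID_0007.eye."""
    if not 0 <= index <= 9999:
        raise ValueError("index must fit in four digits")
    return f"BioID_{index:04d}.eye"


def bioid_pairs(count: int = 1521, start: int = 0) -> List[Tuple[str, str]]:
    """List (image, eye file) name pairs; start=0 is an assumption."""
    return [
        (bioid_image_name(i), bioid_eye_name(i))
        for i in range(start, start + count)
    ]


pairs = bioid_pairs()
print(pairs[0])
print(len(pairs))
```

Adjust `start` if the downloaded archive turns out to number its files differently.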
Tufts Face Database
gts.ai
json
Updated Dec 3, 2023
Cite
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED (2023). Tufts Face Database [Dataset]. https://gts.ai/dataset-download/tufts-face-database-ai-data-collection-company/
Available download formats: json
Dataset updated
Dec 3, 2023
Dataset authored and provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The Tufts Face Database is a comprehensive collection of human face images, ideal for facial recognition, biometric verification, and computer vision model training. It includes diverse data by ethnicity, age, gender, and region for robust AI development.
Open Celebrity Faces Dataset
gts.ai
json
Updated May 26, 2024
Cite
GTS (2024). Open Celebrity Faces Dataset [Dataset]. https://gts.ai/dataset-download/open-celebrity-faces-dataset/
Available download formats: json
Dataset updated
May 26, 2024
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The Open Celebrity Faces Dataset is built for evaluating and advancing face reidentification and recognition algorithms. Featuring 258 categories of celebrity images across various ages, resolutions, and conditions, it is ideal for machine learning applications in security, media, and entertainment.

