Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset contains around 9.6k images of human faces, comprising both real photographs and AI-generated images.
The zip contains two folders:
- Real Images: 5000 images of real human faces
- AI-Generated Images: 4630 images of AI-generated human faces
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Dataset comprises 500,600+ images of individuals representing various races, genders, and ages, with each person having a single face image. It is designed for facial recognition and face detection research, supporting the development of advanced recognition systems.
By leveraging this dataset, researchers and developers can enhance deep learning models, improve face verification and face identification techniques, and refine detection algorithms to recognize faces more accurately in real-world scenarios.
All images come with rigorously verified metadata annotations (age, gender, ethnicity), achieving ≥95% labeling accuracy. Images are also captured under different lighting conditions and resolutions, enhancing the dataset's utility for computer vision tasks and image classification.
Researchers can leverage this dataset to improve recognition technology and develop learning models that enhance the accuracy of face detection. The dataset also supports projects focused on face anti-spoofing and deep learning applications, making it a useful tool for those studying biometric security and liveness detection technologies.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
110,000+ photos of 74,000+ men from 141 countries. The dataset includes photos of people's faces. All people presented in the dataset are men. The dataset contains a variety of images capturing individuals from diverse backgrounds and age groups.
Our dataset will diversify your data by adding more photos of men of different ages and ethnic groups, enhancing the quality of your model.
[Image: People in the dataset]
The dataset can be utilized for a wide range of tasks, including face recognition, age estimation, image feature extraction, or any problem related to human image analysis.
The dataset consists of:
- files: includes 20 images corresponding to each person in the sample
- .csv file: contains information about the images and people in the dataset
keywords: biometric system, biometric dataset, face recognition database, face recognition dataset, face detection dataset, facial analysis, object detection dataset, deep learning datasets, computer vision dataset, human images dataset, human faces dataset, machine learning, image-to-image, verification models, digital photo-identification, men images, males dataset, male selfie, male face recognition
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is a curated subset of 3000 images extracted from a larger collection of approximately 80,000 AI-generated faces. It features diverse, synthetic facial images created using advanced generative models, each with unique characteristics and expressions. Designed for focused testing and smaller-scale machine learning tasks, this subset offers a manageable sample size for experimentation with facial recognition and model validation. For broader applications and comprehensive studies, refer to the full dataset available at Original Dataset.
To access images in the ai-face-dataset-3000-images directory on Kaggle, list the files using os.listdir('/kaggle/input/ai-face-dataset-3000-images'). You can then load and process an image using libraries like PIL with Image.open('/kaggle/input/ai-face-dataset-3000-images/your-image-file.jpg').
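The access steps above can be sketched as follows. This is a minimal sketch, assuming the directory path quoted in the description and that Pillow is installed; the helper names `list_image_files` and `load_image` and the extension filter are illustrative and not part of the dataset.

```python
import os

# Path quoted in the dataset description; adjust if your input directory differs.
DATASET_DIR = "/kaggle/input/ai-face-dataset-3000-images"

# Assumed set of image extensions; the description only shows a .jpg example.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_image_files(directory):
    """Return the sorted names of image files found directly in `directory`."""
    return sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def load_image(directory, file_name):
    """Open one image with Pillow (imported lazily so listing works without it)."""
    from PIL import Image
    return Image.open(os.path.join(directory, file_name))


if __name__ == "__main__":
    # Only touch the dataset when the directory actually exists.
    if os.path.isdir(DATASET_DIR):
        files = list_image_files(DATASET_DIR)
        print(f"Found {len(files)} image files")
        if files:
            image = load_image(DATASET_DIR, files[0])
            print(image.size)
```

Filtering by extension keeps any non-image files in the directory out of the list before an image is opened.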
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Dataset contains 23,110 individuals, each contributing 28 images featuring various angles and head positions, diverse backgrounds, and attributes, along with 1 ID photo. In total, the dataset comprises over 670,000 images in formats such as JPG and PNG. It is designed to advance face recognition research, focusing on person re-identification and recognition systems.
By utilizing this dataset, researchers can explore various recognition applications, including face verification and face identification.
Face pose labels are more than 97% accurate, ensuring reliable data for training and testing recognition algorithms.
Dataset includes high-quality images that capture human faces in different poses and expressions, allowing for comprehensive analysis in recognition tasks. It is particularly valuable for developing and evaluating deep learning models and computer vision techniques.
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Description: Human Faces and Objects Dataset (HFO-5000)
The Human Faces and Objects Dataset (HFO-5000) is a curated collection of 5,000 images, categorized into three distinct classes: male faces (1,500), female faces (1,500), and objects (2,000). This dataset is designed for machine learning and computer vision applications, including image classification, face detection, and object recognition. The dataset provides high-quality, labeled images with a structured CSV file for seamless integration into deep learning pipelines.
Column Description: The dataset is accompanied by a CSV file that contains essential metadata for each image. The CSV file includes the following columns:
- file_name: The name of the image file (e.g., image_001.jpg).
- label: The category of the image, with three possible values: "male" (for male face images), "female" (for female face images), or "object" (for images of various objects).
- file_path: The full or relative path to the image file within the dataset directory.
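The column layout above can be read with the standard library. The following is a minimal sketch, assuming the CSV has a header row with exactly the three column names listed; the sample rows, the `images/` paths, and the helper names `read_metadata` and `count_labels` are made up for illustration, and the real CSV file name is not given in the description.

```python
import csv
import io
from collections import Counter


def read_metadata(csv_file):
    """Read rows from an open CSV file object into a list of dicts."""
    return list(csv.DictReader(csv_file))


def count_labels(rows):
    """Count how many rows carry each value of the `label` column."""
    return Counter(row["label"] for row in rows)


# Tiny in-memory stand-in for the real CSV (rows are made up for illustration).
SAMPLE_CSV = """file_name,label,file_path
image_001.jpg,male,images/image_001.jpg
image_002.jpg,female,images/image_002.jpg
image_003.jpg,object,images/image_003.jpg
image_004.jpg,object,images/image_004.jpg
"""

rows = read_metadata(io.StringIO(SAMPLE_CSV))
label_counts = count_labels(rows)
print(label_counts)
```

With the real file, the same functions apply to a file object opened with `open(path, newline="")`; counting labels first is a quick check that the class sizes match the stated 1,500 / 1,500 / 2,000 split.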
Uniqueness and Key Features:
1) Balanced Distribution: The dataset maintains an even distribution of human faces (male and female) to minimize bias in classification tasks.
2) Diverse Object Selection: The object category consists of a wide variety of items, ensuring robustness in distinguishing between human and non-human entities.
3) High-Quality Images: The dataset consists of clear and well-defined images, suitable for both training and testing AI models.
4) Structured Annotations: The CSV file simplifies dataset management and integration into machine learning workflows.
5) Potential Use Cases: This dataset can be used for tasks such as gender classification, facial recognition benchmarking, human-object differentiation, and transfer learning applications.
Conclusion: The HFO-5000 dataset provides a well-structured, diverse, and high-quality set of labeled images that can be used for various computer vision tasks. Its balanced distribution of human faces and objects ensures fairness in training AI models, making it a valuable resource for researchers and developers. By offering structured metadata and a wide range of images, this dataset facilitates advancements in deep learning applications related to facial recognition and object classification.
This dataset is a curated subset of the CelebFaces Attributes (CelebA) Dataset, handpicked for deep learning tasks such as image synthesis and facial recognition. It includes 50,000 celebrity face images from diverse identities, covering a wide range of poses, backgrounds, and facial attributes. These images are suitable for experimenting with GANs, facial recognition models, and other machine learning tasks related to face analysis.
This dataset is perfect for hobbyists, researchers, and machine learning practitioners looking to experiment with a manageable yet diverse collection of celebrity face images.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Middle Eastern Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.
The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:
To ensure robustness and real-world utility, images were captured under diverse conditions:
Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:
This rich metadata helps train models that can recognize faces even when partially obscured.
This dataset is ideal for a wide range of real-world and research-focused applications, including:
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Our LSLF dataset consists of 1,195,976 labeled face images of 11,459 individuals. These images are stored in JPEG format with a total size of 5.36 GB. Individuals have a minimum of 1 face image and a maximum of 1,157 face images. The average number of face images per individual is 104. Each image is automatically named as (PersonName VideoNumber FrameNumber ImageNumber) and stored in the related individual folder.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Similar face recognition has always been one of the most challenging research directions in face recognition. This project shares the similar face images (SFD.zip) that we have collected so far. All images are labeled and were collected from publicly available datasets such as LFW and CASIA-WebFace. We will continue to collect larger-scale data and update this project. Because the dataset is too large, we uploaded a compressed zip file (SFD.zip), along with a few examples for everyone to view.
Email: ileven@shu.edu.cn
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Pgu-Face dataset contains 896 images from 224 different subjects. All of the subjects were Iranian men, and most of them live in tropical regions of southwestern Iran. The subjects' ages ranged from 16 to 82 years, with an average of 27.89 years. In addition, we make the following information available for the subjects: age and camera quality in megapixels.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A large-scale dataset for age estimation from facial images, including Indian Movie Face Database (IMFDB) with 19,906 labeled images and UTKFace with over 20,000 images labeled with age, gender, and ethnicity. Useful for AI, biometrics, and facial recognition research.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and Tracking
Description
SoloFace is a custom dataset derived from the COCO-Faces and Visual Wake Word datasets, specifically designed for single-face detection tasks in resource-constrained environments. This dataset is ideal for developing machine learning models for embedded AI applications, such as TinyML, which operate on low-power devices. Each image either contains a single human face or no face, with corresponding labels providing class information and bounding box coordinates for face detection. The dataset includes data augmentation to ensure robustness across diverse conditions, such as variations in lighting, scale, and orientation.
Dataset Structure
The dataset is organized into three subsets: train, test, and val. Each subset contains:
- images/: .jpg image files.
- labels/: .json label files with filenames matching the images.
Label Format
Each .json label file includes:
- image: Name of the corresponding image file.
- class: 1 if a face is present, 0 otherwise.
- bbox: Normalized bounding box coordinates [top_left_x, top_left_y, bottom_right_x, bottom_right_y]. If no face is present, the bounding box is set to [0.0, 0.0, 0.01, 0.01].
Statistics
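The label fields above can be read as in the following sketch. It assumes that "normalized" means x values are fractions of the image width and y values are fractions of the image height, which the description does not state explicitly; the function name `parse_label`, the example labels, and the 96x96 image size are hypothetical.

```python
import json


def parse_label(label_text, image_width, image_height):
    """Parse one label and scale its normalized bbox to pixel coordinates.

    Returns (has_face, (x1, y1, x2, y2)); the box is None when class is 0.
    """
    label = json.loads(label_text)
    has_face = label["class"] == 1
    if not has_face:
        # The placeholder box [0.0, 0.0, 0.01, 0.01] carries no information.
        return False, None
    x1, y1, x2, y2 = label["bbox"]
    box = (
        round(x1 * image_width),
        round(y1 * image_height),
        round(x2 * image_width),
        round(y2 * image_height),
    )
    return True, box


# Made-up example labels following the documented fields.
face_label = '{"image": "example.jpg", "class": 1, "bbox": [0.25, 0.5, 0.75, 1.0]}'
no_face_label = '{"image": "empty.jpg", "class": 0, "bbox": [0.0, 0.0, 0.01, 0.01]}'

print(parse_label(face_label, 96, 96))     # (True, (24, 48, 72, 96))
print(parse_label(no_face_label, 96, 96))  # (False, None)
```

Checking the class field before using the box avoids treating the placeholder box of a no-face image as a real detection target.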
Original Dataset:
After Data Augmentation:
Class Distribution:
Data Augmentation Details
To improve model robustness, the following augmentation techniques were applied to the training set:
Each augmentation preserved bounding box consistency with the transformed images.
Usage
This dataset supports the following use cases:
Loading the Dataset
unzip soloface-detection-dataset.zip
soloface-detection-dataset/
├── train/
│   ├── images/
│   └── labels/
├── test/
│   ├── images/
│   └── labels/
└── val/
    ├── images/
    └── labels/
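Given the directory layout above, images and labels can be matched by filename. This is a minimal sketch, assuming each label shares its image's filename stem (for example `x.jpg` and `x.json`), as the structure description suggests; the helper name `pair_images_with_labels` is hypothetical.

```python
from pathlib import Path


def pair_images_with_labels(subset_dir):
    """Return sorted (image_path, label_path) pairs that share a filename stem."""
    subset = Path(subset_dir)
    pairs = []
    for image_path in sorted((subset / "images").glob("*.jpg")):
        label_path = subset / "labels" / (image_path.stem + ".json")
        # Skip images whose label file is missing rather than failing.
        if label_path.exists():
            pairs.append((image_path, label_path))
    return pairs


if __name__ == "__main__":
    root = Path("soloface-detection-dataset")
    for subset_name in ("train", "test", "val"):
        subset_dir = root / subset_name
        if subset_dir.is_dir():
            print(subset_name, len(pair_images_with_labels(subset_dir)))
```

Comparing the pair count per subset against the number of image files is a quick way to spot missing labels after unzipping.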
License
This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
For more details, visit the CC BY 4.0 License.
Contact
For inquiries or collaborations, please contact:
sahabidyut999@gmail.com
study.riya1792@gmail.com
The 110 People – Human Face Image Data was collected by camera from 110 participants, with balanced gender and age-group distributions covering major skin tones. Each person contributes 2100 pictures that vary in glasses/no glasses, expression, camera shooting angle, and lighting conditions. All attributes, such as gender, age, and expression, are annotated. The overall accuracy rate is ≥ 97%. This dataset is suitable for face recognition, facial expression analysis, and AI training.
This dataset consists of all 70k REAL faces from the Flickr dataset collected by Nvidia, as well as 70k fake faces sampled from the 1 Million FAKE faces (generated by StyleGAN) that was provided by Bojan.
In this dataset, I combined both datasets for convenience, resized all the images to 256px, and split the data into train, validation, and test sets. I also included some CSV files for convenience.
For more details, check out these threads:
- Thread for real faces dataset: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/122786
- 1 Million Fake faces: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/121173
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Native American Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.
The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:
To ensure robustness and real-world utility, images were captured under diverse conditions:
Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:
This rich metadata helps train models that can recognize faces even when partially obscured.
This dataset is ideal for a wide range of real-world and research-focused applications, including:
https://www.futurebeeai.com/policies/ai-data-license-agreement
The South Asian Children Facial Image Dataset is a thoughtfully curated collection designed to support the development of advanced facial recognition systems, biometric identity verification, age estimation tools, and child-specific AI models. This dataset enables researchers and developers to build highly accurate, inclusive, and ethically sourced AI solutions for real-world applications.
The dataset includes over 1500 high-resolution image sets of children under the age of 18. Each participant contributes approximately 15 unique facial images, captured to reflect natural variations in appearance and context.
To ensure robust model training and generalizability, images are captured under varied natural conditions:
Each child's image set is paired with detailed, structured metadata, enabling granular control and filtering during model training:
This metadata is essential for applications that require demographic awareness, such as region-specific facial recognition or bias mitigation in AI models.
This dataset is ideal for a wide range of computer vision use cases, including:
We maintain the highest ethical and security standards throughout the data lifecycle:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The BioID Face Database has been recorded and is published to give all researchers working in the area of face detection the possibility to compare the quality of their face detection algorithms with others. During the recording, special emphasis was placed on real-world conditions. The test set therefore features a large variety of illumination, background, and face size. The dataset consists of 1521 gray-level images with a resolution of 384x286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. For comparison purposes, the set also contains manually set eye positions. The images are labeled BioID_xxxx.pgm, where the characters xxxx are replaced by the index of the current image (with leading zeros). Similarly, the files BioID_xxxx.eye contain the eye positions for the corresponding images.
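The naming scheme above can be expressed as a small helper. This is a sketch only: the four-digit zero padding is inferred from the `xxxx` placeholder, the starting index is not stated in the description, and the function name `bioid_file_names` is hypothetical. The internal layout of the .eye files is not described here, so the sketch does not parse them.

```python
def bioid_file_names(index):
    """Return the (image, eye-position) file names for one image index."""
    # Assumption: the index is zero-padded to four digits, matching "xxxx".
    stem = f"BioID_{index:04d}"
    return stem + ".pgm", stem + ".eye"


print(bioid_file_names(7))  # ('BioID_0007.pgm', 'BioID_0007.eye')
```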
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Tufts Face Database is a comprehensive collection of human face images, ideal for facial recognition, biometric verification, and computer vision model training. It includes diverse data by ethnicity, age, gender, and region for robust AI development.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Open Celebrity Faces Dataset is built for evaluating and advancing face reidentification and recognition algorithms. Featuring 258 categories of celebrity images across various ages, resolutions, and conditions, it is ideal for machine learning applications in security, media, and entertainment.