100+ datasets found
  1. Data from: Human Faces Dataset

    • kaggle.com
    Updated Aug 26, 2024
    Cite
    Kaustubh Dhote (2024). Human Faces Dataset [Dataset]. https://www.kaggle.com/datasets/kaustubhdhote/human-faces-dataset
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kaustubh Dhote
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The dataset contains around 9.6k images of human faces, both real and AI-generated.

    The zip contains two folders:
    - Real Images: 5000 images of real human faces
    - AI-Generated Images: 4630 images of AI-generated human faces
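
A minimal sketch of how the two folders could be turned into labeled samples for a real-versus-generated classifier. The folder names come from the description above; the root path, the label values, and the list of image extensions are assumptions, since the listing does not state the file format.

```python
from pathlib import Path

# Folder names as given in the description; label 0 = real, 1 = AI-generated
# (the numeric labels are an arbitrary choice for this sketch).
FOLDER_LABELS = {"Real Images": 0, "AI-Generated Images": 1}
# Assumed extensions; the description does not state the file format.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root):
    """Return (path, label) pairs for every image under the two folders."""
    samples = []
    for folder, label in FOLDER_LABELS.items():
        directory = Path(root) / folder
        if not directory.is_dir():
            continue  # tolerate a missing folder instead of raising
        for path in sorted(directory.rglob("*")):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label))
    return samples
```

Calling `list_labeled_images("human-faces-dataset")` on the extracted zip (a hypothetical path) would yield one list that can be shuffled and split for training.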

  2. face-recognition-image-dataset

    • huggingface.co
    Updated Apr 15, 2025
    Cite
    Unidata (2025). face-recognition-image-dataset [Dataset]. https://huggingface.co/datasets/UniDataPro/face-recognition-image-dataset
    Dataset updated
    Apr 15, 2025
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Image Dataset of face images for computer vision tasks

    Dataset comprises 500,600+ images of individuals representing various races, genders, and ages, with each person having a single face image. It is designed for facial recognition and face detection research, supporting the development of advanced recognition systems. By leveraging this dataset, researchers and developers can enhance deep learning models, improve face verification and face identification techniques, and refine… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/face-recognition-image-dataset.

  3. 50K Celebrity Faces Image Dataset

    • kaggle.com
    Updated Aug 3, 2023
    Cite
    Farzad Nekouei (2023). 50K Celebrity Faces Image Dataset [Dataset]. https://www.kaggle.com/datasets/farzadnekouei/50k-celebrity-faces-image-dataset
    Dataset updated
    Aug 3, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Farzad Nekouei
    Description

    This dataset is a curated subset of the CelebFaces Attributes (CelebA) Dataset, handpicked for deep learning tasks such as image synthesis and facial recognition. It includes 50,000 celebrity face images from diverse identities, covering a wide range of poses, backgrounds, and facial attributes. These images are suitable for experimenting with GANs, facial recognition models, and other machine learning tasks related to face analysis.

    This dataset is perfect for hobbyists, researchers, and machine learning practitioners looking to experiment with a manageable yet diverse collection of celebrity face images.

  4. face-re-identification-image-dataset

    • huggingface.co
    Updated Mar 30, 2025
    Cite
    Unidata (2025). face-re-identification-image-dataset [Dataset]. https://huggingface.co/datasets/UniDataPro/face-re-identification-image-dataset
    Dataset updated
    Mar 30, 2025
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Dataset of face images with different angles and head positions

    Dataset contains 23,110 individuals, each contributing 28 images featuring various angles and head positions, diverse backgrounds, and attributes, along with 1 ID photo. In total, the dataset comprises over 670,000 images in formats such as JPG and PNG. It is designed to advance face recognition and facial recognition research, focusing on person re-identification and recognition systems. By utilizing this dataset… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/face-re-identification-image-dataset.
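
As a quick consistency check on the figures quoted above, the stated total can be reproduced from the per-person counts. This sketch assumes every individual contributes exactly 28 varied images plus 1 ID photo, which the truncated description suggests but does not guarantee.

```python
individuals = 23_110
images_per_person = 28 + 1  # 28 varied images plus 1 ID photo (assumed uniform)
total_images = individuals * images_per_person
print(total_images)  # 670190, consistent with "over 670,000 images"
```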

  5. male-selfie-image-dataset

    • huggingface.co
    Updated May 2, 2024
    Cite
    Unique Data (2024). male-selfie-image-dataset [Dataset]. https://huggingface.co/datasets/UniqueData/male-selfie-image-dataset
    Dataset updated
    May 2, 2024
    Authors
    Unique Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Face Recognition, Face Detection, Male Photo Dataset 👨

      The dataset is created on the basis of Selfies and ID Dataset
    

    110,000+ photos of 74,000+ men from 141 countries. The dataset includes photos of people's faces. All people presented in the dataset are men. The dataset contains a variety of images capturing individuals from diverse backgrounds and age groups. Our dataset will diversify your data by adding more photos of men of different ages and ethnic groups… See the full description on the dataset page: https://huggingface.co/datasets/UniqueData/male-selfie-image-dataset.

  6. South Asian Occluded Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). South Asian Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-south-asian
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the South Asian Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.

    Facial Image Data

    The dataset comprises over 5,000 high-quality facial images, organized into participant-wise sets. Each set includes:

    • Occluded Images: 5 images per individual featuring different types of facial occlusions: masks, caps, sunglasses, or combinations of these accessories
    • Normal Image: 1 reference image of the same individual without any occlusion

    Diversity & Representation

    • Geographic Coverage: Participants from across India, Pakistan, Bangladesh, Nepal, Sri Lanka, Bhutan, Maldives, and more South Asian countries
    • Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
    • File Formats: Images available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure robustness and real-world utility, images were captured under diverse conditions:

    • Lighting Variations: Includes both natural and artificial lighting scenarios
    • Background Diversity: Indoor and outdoor backgrounds for model generalization
    • Device Quality: Captured using the latest smartphones to ensure high resolution and consistency

    Metadata

    Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:

    • Unique Participant ID
    • File Name
    • Age
    • Gender
    • Country
    • Demographic Profile
    • Type of Occlusion
    • File Format

    This rich metadata helps train models that can recognize faces even when partially obscured.

    Use Cases & Applications

    This dataset is ideal for a wide range of real-world and research-focused applications, including:

    • Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
    • Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
    • Biometric Identity Systems: Enhance verification accuracy across varying conditions
    • KYC & Compliance: Support face matching even when the selfie includes common occlusions
    • Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage

    Secure & Ethical Collection

    • Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
    • Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
    • Transparent Participation: All contributors provided written consent and were informed of the intended use

  7. Stable Diffusion Face Dataset

    • kaggle.com
    Updated Apr 23, 2024
    Cite
    Mohannad Ayman Salah (2024). Stable Diffusion Face Dataset [Dataset]. https://www.kaggle.com/datasets/mohannadaymansalah/stable-diffusion-dataaaaaaaaa
    Dataset updated
    Apr 23, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mohannad Ayman Salah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    About the images:

    Fake AI-generated human faces created using Stable Diffusion 1.5, 2.1, and SDXL 1.0 checkpoints. The main objective was to generate photos that were as realistic as possible, without any specific style, focusing mainly on the face.

    Fake AI-generated human faces

    • Images in 512x512px resolution were generated using SD 1.5;
    • Images in 768x768px resolution were generated using SD 2.1;
    • Images in 1024x1024px resolution were generated using SD XL 1.0;
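
Because each resolution in the list above corresponds to one model version, the generator of an image can in principle be inferred from its size. The sketch below assumes the mapping is exact and that no images were resized after generation; the function name is hypothetical.

```python
# Mapping taken from the list above: square edge length in pixels -> generator.
RESOLUTION_TO_MODEL = {512: "SD 1.5", 768: "SD 2.1", 1024: "SDXL 1.0"}


def generator_for_size(width, height):
    """Return the Stable Diffusion version for an image size, or None if unknown."""
    if width != height:
        return None  # the description only lists square resolutions
    return RESOLUTION_TO_MODEL.get(width)
```

The width and height could come from any image library; they are passed in as plain integers here so the sketch has no external dependencies.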

    More details on the images and the process of creating the images in the readme file.

    The data is not mine; it is taken from a GitHub repository by the user tobecwb. Repo link: https://github.com/tobecwb/stable-diffusion-face-dataset

  8. Human Face Image Matting (hair&faces)

    • kaggle.com
    Updated Apr 24, 2023
    Cite
    KUCEV ROMAN (2023). Human Face Image Matting (hair&faces) [Dataset]. https://www.kaggle.com/datasets/tapakah68/matting-hairfaces
    Dataset updated
    Apr 24, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    KUCEV ROMAN
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Matting (hair&faces) - faces dataset

    Accurately estimated foreground object in images. Dataset for editing applications for creating visual effects.

    💴 For Commercial Usage: To discuss your requirements, learn about the price and buy the dataset, leave a request on roman@kucev.com to buy the dataset

    Content

    Includes 2 folders:
    - images: original images of faces
    - masks: matting masks for images
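
A minimal sketch of pairing each face image with its matting mask. The folder names `images` and `masks` come from the description; matching files by filename stem is an assumption, since the listing does not say how masks are named.

```python
from pathlib import Path


def pair_images_with_masks(root):
    """Match files in images/ and masks/ by filename stem (an assumed convention)."""
    images = {p.stem: p for p in (Path(root) / "images").iterdir() if p.is_file()}
    masks = {p.stem: p for p in (Path(root) / "masks").iterdir() if p.is_file()}
    shared = sorted(images.keys() & masks.keys())
    return [(images[stem], masks[stem]) for stem in shared]
```

Images without a mask (or masks without an image) are silently skipped, which makes a mismatch in counts easy to spot by comparing the result length with the folder sizes.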

    💴 Buy the Dataset: This is just an example of the data. Leave a request on roman@kucev.com to discuss your requirements, learn about the price and buy the dataset.

    keywords: head segmentation dataset, face-generation, segmentation, human faces, portrait segmentation, human face extraction, image segmentation, annotation, biometric dataset, biometric data dataset, face recognition database, facial recognition, face forgery detection, face shape, ar, augmented reality, face detection dataset, facial analysis, human images dataset, hair segmentation, matting, image matting, computer vision, deep learning, portrait matting, natural image matting

  9. Instagram Faces Image Dataset

    • unidata.pro
    jpg
    Cite
    Unidata L.L.C-FZ, Instagram Faces Image Dataset [Dataset]. https://unidata.pro/datasets/instagram-faces-image/
    Available download formats: jpg
    Dataset authored and provided by
    Unidata L.L.C-FZ
    Description

    Instagram Faces Image dataset with diverse single-face images for facial recognition, anti-spoofing, and computer vision

  10. Large-scale Labeled Faces (LSLF) Dataset.zip

    • figshare.com
    Updated Jun 1, 2023
    Cite
    Tarik Alafif; Zeyad Hailat; Melih Aslan; Xuewen Chen (2023). Large-scale Labeled Faces (LSLF) Dataset.zip [Dataset]. http://doi.org/10.6084/m9.figshare.13077329.v1
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Tarik Alafif; Zeyad Hailat; Melih Aslan; Xuewen Chen
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Our LSLF dataset consists of 1,195,976 labeled face images for 11,459 individuals. These images are stored in JPEG format with a total size of 5.36 GB. Individuals have a minimum of 1 face image and a maximum of 1,157 face images. The average number of face images per individual is 104. Each image is automatically named as (PersonName VideoNumber FrameNumber ImageNumber) and stored in the related individual folder.

  11. Face datasets

    • generated.photos
    Updated Jul 20, 2024
    Cite
    Generated Media, Inc. (2024). Face datasets [Dataset]. https://generated.photos/datasets/faces
    Dataset updated
    Jul 20, 2024
    Dataset authored and provided by
    Generated Media, Inc.
    Description

    AI-generated, high-quality face datasets. Based on model-released photos. Diverse expressions, ethnicities, and age groups. Excellent for face recognition and analysis projects.

  12. Similar Face Dataset (SFD)

    • figshare.com
    zip
    Updated Jan 15, 2020
    Cite
    AnPing Song (2020). Similar Face Dataset (SFD) [Dataset]. http://doi.org/10.6084/m9.figshare.11611071.v3
    Available download formats: zip
    Dataset updated
    Jan 15, 2020
    Dataset provided by
    figshare
    Authors
    AnPing Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Similar face recognition has always been one of the most challenging research directions in face recognition. This project shares the similar face images (SFD.zip) that we have collected so far. All images are labeled and were collected from publicly available datasets such as LFW and CASIA-WebFace. We will continue to collect larger-scale data and update this project. Because the data set is too large, we uploaded a compressed zip file (SFD.zip). Meanwhile, we have uploaded a few examples here for everyone to view. Email: ileven@shu.edu.cn

  13. BioID Face Database

    • bioid.com
    Updated Nov 15, 2006
    Cite
    BioID (2006). BioID Face Database [Dataset]. https://www.bioid.com/face-database/
    Available download formats: text/csv+zip, text//x-portable-graymap+zip
    Dataset updated
    Nov 15, 2006
    Dataset authored and provided by
    BioID
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Pixel
    Description

    The BioID Face Database has been recorded and is published to give all researchers working in the area of face detection the possibility to compare the quality of their face detection algorithms with others. During the recording, special emphasis was placed on real-world conditions. Therefore, the test set features a large variety of illumination, background, and face size. The dataset consists of 1521 gray-level images with a resolution of 384x286 pixels. Each one shows the frontal view of a face of one out of 23 different test persons. For comparison, the set also contains manually set eye positions. The images are labeled BioID_xxxx.pgm, where the characters xxxx are replaced by the index of the current image (with leading zeros). Similarly, the files BioID_xxxx.eye contain the eye positions for the corresponding images.
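
The naming scheme described above can be sketched as a small helper that builds the image and eye-annotation file names for a given index. The four-digit zero padding follows the `xxxx` pattern in the description; whether numbering starts at 0 or 1 is not stated in this listing, so the index range is left to the caller.

```python
def bioid_file_names(index):
    """Return the (.pgm image, .eye annotation) file names for one image index."""
    if index < 0 or index > 9999:
        raise ValueError("index must fit in four digits")
    stem = f"BioID_{index:04d}"  # zero-padded to match the xxxx pattern
    return f"{stem}.pgm", f"{stem}.eye"
```

For example, `bioid_file_names(7)` returns `("BioID_0007.pgm", "BioID_0007.eye")`.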

  14. Caucasian Occluded Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Caucasian Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-caucasian
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Caucasian Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.

    Facial Image Data

    The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:

    • Occluded Images: 5 images per individual featuring different types of facial occlusions: masks, caps, sunglasses, or combinations of these accessories
    • Normal Image: 1 reference image of the same individual without any occlusion

    Diversity & Representation

    • Geographic Coverage: Participants from across Spain, Italy, Turkey, Germany, France, and more Caucasian countries
    • Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
    • File Formats: Images available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure robustness and real-world utility, images were captured under diverse conditions:

    • Lighting Variations: Includes both natural and artificial lighting scenarios
    • Background Diversity: Indoor and outdoor backgrounds for model generalization
    • Device Quality: Captured using the latest smartphones to ensure high resolution and consistency

    Metadata

    Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:

    • Unique Participant ID
    • File Name
    • Age
    • Gender
    • Country
    • Demographic Profile
    • Type of Occlusion
    • File Format

    This rich metadata helps train models that can recognize faces even when partially obscured.

    Use Cases & Applications

    This dataset is ideal for a wide range of real-world and research-focused applications, including:

    • Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
    • Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
    • Biometric Identity Systems: Enhance verification accuracy across varying conditions
    • KYC & Compliance: Support face matching even when the selfie includes common occlusions
    • Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage

    Secure & Ethical Collection

    • Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
    • Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
    • Transparent Participation: All contributors provided written consent and were informed of the intended use

    Dataset Updates &

  15. Faces: Age Detection from Images

    • gts.ai
    csv, jpeg, json
    Updated Mar 28, 2024
    Cite
    Globose Technology Solutions Private Limited (2024). Faces: Age Detection from Images [Dataset]. https://gts.ai/dataset-download/faces-age-detection-from-images/
    Available download formats: csv, json, jpeg
    Dataset updated
    Mar 28, 2024
    Dataset authored and provided by
    Globose Technology Solutions Private Limited
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A large-scale dataset for age estimation from facial images, including Indian Movie Face Database (IMFDB) with 19,906 labeled images and UTKFace with over 20,000 images labeled with age, gender, and ethnicity. Useful for AI, biometrics, and facial recognition research.

  16. Data from: Pgu-Face: a dataset of partially covered facial images

    • data.mendeley.com
    • search.datacite.org
    Updated Aug 24, 2016
    Cite
    seyed reza salari (2016). Pgu-Face: a dataset of partially covered facial images [Dataset]. http://doi.org/10.17632/znpyrgbfdr.1
    Dataset updated
    Aug 24, 2016
    Authors
    seyed reza salari
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Pgu-Face dataset contains 896 images from 224 different subjects. All of the subjects were Iranian men, and most of them live in tropical regions of the southwest of Iran. The subjects' ages ranged from 16 to 82 years, with an average of 27.89 years. In addition, we make the following information available for the subjects: age and quality of the camera in megapixels.

  17. Labeled Faces in the Wild aligned (LFW-a)

    • academictorrents.com
    bittorrent
    Updated Nov 26, 2015
    Cite
    Yaniv Taigman and Lior Wolf and Tal Hassner (2015). Labeled Faces in the Wild aligned (LFW-a) [Dataset]. https://academictorrents.com/details/403e6d6945a64dd1b9e185a6cd8d029274efccdc
    Available download formats: bittorrent (96770694)
    Dataset updated
    Nov 26, 2015
    Dataset authored and provided by
    Yaniv Taigman and Lior Wolf and Tal Hassner
    License

    https://academictorrents.com/nolicensespecified

    Description

    The "Labeled Faces in the Wild-a" image collection is a database of labeled face images intended for studying face recognition in unconstrained images. It contains the same images available in the original Labeled Faces in the Wild data set; however, here we provide them after alignment using commercial face alignment software. Some of our results, published in [1,2,3], were produced using these images. We show this alignment to improve the performance of face recognition algorithms. More information on how these images were aligned may be found in the two papers. We have maintained the same directory structure as in the original LFW data set, and so these images can be used as direct substitutes for those in the original image set. Note, however, that the images available here are grayscale versions of the originals. Citation: If you find these images useful and use them in your work, please follow these guidelines: Comply with any instructions specified for the original L

  18. infrared-face-recognition-dataset

    • huggingface.co
    Updated Mar 18, 2025
    Cite
    Unidata (2025). infrared-face-recognition-dataset [Dataset]. https://huggingface.co/datasets/UniDataPro/infrared-face-recognition-dataset
    Dataset updated
    Mar 18, 2025
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Infrared Face Detection Dataset

    Dataset contains 125,500+ images, including infrared images, from 4,484 individuals with or without a mask of various races, genders, and ages. It is specifically designed for research in face recognition and facial recognition technology, focusing on the unique challenges posed by thermal infrared imaging. By utilizing this dataset, researchers and developers can enhance their understanding of recognition systems and improve the recognition accuracy… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/infrared-face-recognition-dataset.

  19. SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and...

    • zenodo.org
    zip
    Updated Dec 15, 2024
    Cite
    Riya Samanta; Bidyut Saha (2024). SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and Tracking [Dataset]. http://doi.org/10.5281/zenodo.14474899
    Available download formats: zip
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Riya Samanta; Bidyut Saha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and Tracking

    Description
    SoloFace is a custom dataset derived from the COCO-Faces and Visual Wake Word datasets, specifically designed for single-face detection tasks in resource-constrained environments. This dataset is ideal for developing machine learning models for embedded AI applications, such as TinyML, which operate on low-power devices. Each image either contains a single human face or no face, with corresponding labels providing class information and bounding box coordinates for face detection. The dataset includes data augmentation to ensure robustness across diverse conditions, such as variations in lighting, scale, and orientation.

    Dataset Structure
    The dataset is organized into three subsets: train, test, and val. Each subset contains:

    • images/: .jpg image files.
    • labels/: .json label files with matching filenames to the images.

    Label Format
    Each .json label file includes:

    • image: Name of the corresponding image file.
    • class: 1 if a face is present, 0 otherwise.
    • bbox: Normalized bounding box coordinates [top_left_x, top_left_y, bottom_right_x, bottom_right_y]. If no face is present, the bounding box is set to [0.0, 0.0, 0.01, 0.01].
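
The label format above can be read with a few lines of Python. This is a minimal sketch, assuming each label file is a flat JSON object with the three keys listed, and that x coordinates are normalized by image width and y coordinates by image height; the function name and the rounding to whole pixels are choices made for this example.

```python
import json


def load_face_box(label_text, image_width, image_height):
    """Parse one label and return the face box in pixels, or None if no face."""
    label = json.loads(label_text)
    if label["class"] == 0:
        return None  # the placeholder bbox [0.0, 0.0, 0.01, 0.01] carries no face
    x1, y1, x2, y2 = label["bbox"]  # normalized top-left and bottom-right corners
    return (
        round(x1 * image_width),
        round(y1 * image_height),
        round(x2 * image_width),
        round(y2 * image_height),
    )
```

Checking `class` before touching `bbox` avoids treating the placeholder box of a no-face image as a tiny real detection.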

    Statistics

    • Original Dataset:

      • Training images: 11,272
      • Testing images: 3,732
      • Validation images: 434
    • After Data Augmentation:

      • Training images: 56,360
      • Testing and validation images remain unchanged.
    • Class Distribution:

      • 50% of images contain a single visible human face.
      • 50% contain no human face.

    Data Augmentation Details
    To improve model robustness, the following augmentation techniques were applied to the training set:

    1. Geometric Transformations: Random rotation (±15 degrees), scaling (±20%), and horizontal flipping (50%).
    2. Color Transformations: Brightness and contrast adjustments (±30%).
    3. Cropping: Random cropping up to 10% from image edges.

    Each augmentation preserved bounding box consistency with the transformed images.

    Usage
    This dataset supports the following use cases:

    1. Training lightweight face detection models optimized for microcontroller deployment.
    2. Benchmarking single-face detection models in resource-constrained environments.
    3. Research on model robustness and efficiency.

    Loading the Dataset

    1. Download the dataset.
    2. Extract the dataset using:
      unzip soloface-detection-dataset.zip
      
    3. Dataset structure:
      soloface-detection-dataset/
      ├── train/
      │  ├── images/
      │  ├── labels/
      ├── test/
      │  ├── images/
      │  ├── labels/
      ├── val/
      │  ├── images/
      │  ├── labels/
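
Given the directory layout shown above, images and labels can be paired by filename. The sketch below assumes the extracted root folder is passed in by the caller and relies on the statement that label files have matching filenames to the images; the function name is hypothetical.

```python
from pathlib import Path

SPLITS = ("train", "test", "val")


def iter_samples(root, split):
    """Yield (image_path, label_path) pairs for one split, matched by filename."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    split_dir = Path(root) / split
    for image_path in sorted((split_dir / "images").glob("*.jpg")):
        label_path = split_dir / "labels" / f"{image_path.stem}.json"
        if label_path.exists():  # skip images that have no label file
            yield image_path, label_path
```

A typical call would be `list(iter_samples("soloface-detection-dataset", "train"))`, using the folder name produced by the unzip step.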
      

    License
    This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    • Permissions: Copy, distribute, and adapt for any purpose, including commercial.
    • Conditions: Provide proper attribution, a link to the license, and indicate changes.
    • Restrictions: No additional legal or technological restrictions.

    For more details, visit the CC BY 4.0 License.

    Contact
    For inquiries or collaborations, please contact:

    • Bidyut Saha: sahabidyut999@gmail.com
    • Riya Samanta: study.riya1792@gmail.com


  20. 110 People Face Image Dataset – Multi-Angle, Multi-Light, Multi-Expression,...

    • nexdata.ai
    Updated Oct 21, 2023
    Cite
    Nexdata (2023). 110 People Face Image Dataset – Multi-Angle, Multi-Light, Multi-Expression, Annotated [Dataset]. https://www.nexdata.ai/datasets/computervision/4
    Dataset updated
    Oct 21, 2023
    Dataset authored and provided by
    Nexdata
    Variables measured
    Device, Accuracy, Data size, Data format, Data diversity, Age distribution, Race distribution, Gender distribution, Collecting environment
    Description

    The 110 People – Human Face Image Data was captured by camera from 110 participants, with a balanced gender ratio and age group distribution, covering major skin tones. Each person contributes 2100 pictures that vary glasses/no glasses, expression, camera shooting angle, and lighting conditions. All attributes, such as gender, age, and expression, are annotated. The overall accuracy rate is ≥ 97%. This dataset is suitable for face recognition, facial expression analysis, and AI training.

