100+ datasets found
  1. wider_face

    • huggingface.co
    • opendatalab.com
    • +4 more
    Updated Jan 13, 2022
    Cite
    The Chinese University of Hong Kong (2022). wider_face [Dataset]. https://huggingface.co/datasets/CUHK-CSE/wider_face
    Dataset updated
    Jan 13, 2022
    Dataset authored and provided by
    The Chinese University of Hong Kong
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    WIDER FACE is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. We chose 32,203 images and labeled 393,703 faces with a high degree of variability in scale, pose, and occlusion, as depicted in the sample images. The dataset is organized into 61 event classes. For each event class, we randomly select 40%/10%/50% of the data as training, validation, and testing sets. We adopt the same evaluation metric employed in the PASCAL VOC dataset. As with the MALF and Caltech datasets, we do not release bounding box ground truth for the test images. Users are required to submit final prediction files, which we then evaluate.
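The per-event 40%/10%/50% split described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure the dataset authors actually ran: the function name `split_by_event`, the `(event_class, image_name)` input format, the fixed seed, and the rounding rule are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_event(images, seed=0):
    """Split images into train/val/test (40%/10%/50%) within each event class.

    `images` is a list of (event_class, image_name) pairs (assumed format).
    """
    by_event = defaultdict(list)
    for event, name in images:
        by_event[event].append(name)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for event in sorted(by_event):
        names = sorted(by_event[event])
        rng.shuffle(names)
        n_train = round(0.4 * len(names))
        n_val = round(0.1 * len(names))
        splits["train"] += names[:n_train]
        splits["val"] += names[n_train:n_train + n_val]
        # Whatever remains (about 50%) goes to the test set.
        splits["test"] += names[n_train + n_val:]
    return splits
```

Splitting inside each event class, rather than over the pooled images, keeps every event represented in all three sets in roughly the same proportions.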

  2. Face Detection Dataset

    • kaggle.com
    Updated Dec 30, 2024
    Cite
    Sudhanshu Rastogi (2024). Face Detection Dataset [Dataset]. https://www.kaggle.com/datasets/sudhanshu2198/face-detection-dataset
    Dataset updated
    Dec 30, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sudhanshu Rastogi
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created by reorganizing the WIDER FACE dataset. WIDER FACE is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. It contains 32,203 images and 393,703 labeled faces with a high degree of variability in scale, pose, and occlusion, as depicted in the sample images. WIDER FACE is organized into 61 event classes. For each event class, 40%/10%/50% of the data is randomly selected as training, validation, and testing sets. It adopts the same evaluation metric employed in the PASCAL VOC dataset.

    Original dataset: http://shuoyang1213.me/WIDERFACE/

  3. Performance levels for the individual and average-image targets.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    David J. Robertson; Robin S. S. Kramer; A. Mike Burton (2023). Performance levels for the individual and average-image targets. [Dataset]. http://doi.org/10.1371/journal.pone.0119460.t001
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    David J. Robertson; Robin S. S. Kramer; A. Mike Burton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note. For each celebrity, individual-image targets were tested 50 times (5 targets × 5 test images for both ‘users’ and ‘imposters’), while average-image targets were tested 10 times (average target × 5 test images for both ‘users’ and ‘imposters’). The table shows mean performance by condition across all celebrities (SD in parentheses).

  4. Face Mask Segmentation - WIDER Face Dataset

    • kaggle.com
    zip
    Updated Jul 14, 2021
    Cite
    Vinayak Shanawad (2021). Face Mask Segmentation - WIDER Face Dataset [Dataset]. https://www.kaggle.com/vinayakshanawad/face-mask-segmentation-wider-face-dataset
    Available download formats: zip (833238739 bytes)
    Dataset updated
    Jul 14, 2021
    Authors
    Vinayak Shanawad
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Project Description

    The goal is to build a Face Mask Segmentation model which includes building a face detector to locate the position of a face in an image.

    Data Description

    WIDER FACE is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. It has 32,203 images, in which 393,703 faces are labeled with a high degree of variability in scale, pose, and occlusion, as depicted in the sample images. In this project, we use 409 images and around 1,000 faces for ease of computation.

    We will use transfer learning on a pretrained model to build our segmenter. Specifically, we fine-tune a MobileNet model that is already trained to perform image segmentation: we train the last 6-7 layers and freeze the remaining layers. To train the model for face mask segmentation, we use WIDER FACE images that contain a single face or multiple faces. The output of the model is a segmentation mask that covers each face in an image. The model is built using Keras with TensorFlow.
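The "train the last 6-7 layers, freeze the rest" step can be sketched as a small helper. This is a minimal illustration, not the project's actual code: `freeze_all_but_last` is a hypothetical name, and it only assumes that each layer object exposes a boolean `trainable` attribute, as Keras layers do. The stand-in layers below are plain objects so the sketch runs without TensorFlow installed.

```python
from types import SimpleNamespace


def freeze_all_but_last(layers, n_trainable):
    """Mark the final `n_trainable` layers trainable and freeze the others.

    Works on any objects with a `trainable` attribute (an assumption that
    matches the Keras layer interface).
    """
    layers = list(layers)
    cutoff = max(len(layers) - n_trainable, 0)
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return layers


# Stand-in layers for illustration only; a real model's layer list would go here.
demo_layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(10)]
freeze_all_but_last(demo_layers, 7)
```

With a real Keras model, the same helper could be applied to `model.layers` before compiling, since changes to `trainable` take effect at compile time.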

    Reference

    Acknowledgment for the datasets. http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/ Mobile Net paper: https://arxiv.org/pdf/1704.04861.pdf

    Objective

    In this problem, we use "Transfer Learning" of an Image Segmentation model to detect any object according to the problem in hand. Here, we are particularly interested in segmenting faces in a given image.

  5. Error Rates in Users of Automatic Face Recognition Software

    • plos.figshare.com
    • figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    David White; James D. Dunn; Alexandra C. Schmid; Richard I. Kemp (2023). Error Rates in Users of Automatic Face Recognition Software [Dataset]. http://doi.org/10.1371/journal.pone.0139827
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    David White; James D. Dunn; Alexandra C. Schmid; Richard I. Kemp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, wide deployment of automatic face recognition systems has been accompanied by substantial gains in algorithm performance. However, benchmarking tests designed to evaluate these systems do not account for the errors of human operators, who are often an integral part of face recognition solutions in forensic and security settings. This causes a mismatch between evaluation tests and operational accuracy. We address this by measuring user performance in a face recognition system used to screen passport applications for identity fraud. Experiment 1 measured target detection accuracy in algorithm-generated ‘candidate lists’ selected from a large database of passport images. Accuracy was notably poorer than in previous studies of unfamiliar face matching: participants made over 50% errors for adult target faces, and over 60% when matching images of children. Experiment 2 then compared performance of student participants to trained passport officers–who use the system in their daily work–and found equivalent performance in these groups. Encouragingly, a group of highly trained and experienced “facial examiners” outperformed these groups by 20 percentage points. We conclude that human performance curtails accuracy of face recognition systems–potentially reducing benchmark estimates by 50% in operational settings. Mere practise does not attenuate these limits, but superior performance of trained examiners suggests that recruitment and selection of human operators, in combination with effective training and mentorship, can improve the operational accuracy of face recognition systems.

  6. flymyai-ffhq-edit-bench

    • huggingface.co
    Updated Jun 20, 2025
    Cite
    FlyMy.AI (2025). flymyai-ffhq-edit-bench [Dataset]. https://huggingface.co/datasets/flymy-ai/flymyai-ffhq-edit-bench
    Dataset updated
    Jun 20, 2025
    Dataset provided by
    FlyMy.AI
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Face Identity Preservation Benchmark

    A comprehensive evaluation dataset for face transformation APIs measuring identity preservation across complexity levels and transformation categories. 🔗 Complete Repository: https://github.com/FlyMyAI/bench_M1

    Dataset Summary

    This benchmark evaluates identity preservation in face image transformations using 8,832 transformation pairs across three major APIs. The dataset provides systematic evaluation of face editing quality using… See the full description on the dataset page: https://huggingface.co/datasets/flymy-ai/flymyai-ffhq-edit-bench.

  7. Native American Occluded Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Native American Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-native-american
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Native American Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.

    Facial Image Data

    The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:

    Occluded Images: 5 images per individual featuring different types of facial occlusions, such as masks, caps, sunglasses, or combinations of these accessories
    Normal Image: 1 reference image of the same individual without any occlusion

    Diversity & Representation

    Geographic Coverage: Participants from across USA, Canada, Mexico and more Native American countries
    Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
    File Formats: Images available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure robustness and real-world utility, images were captured under diverse conditions:

    Lighting Variations: Includes both natural and artificial lighting scenarios
    Background Diversity: Indoor and outdoor backgrounds for model generalization
    Device Quality: Captured using the latest smartphones to ensure high resolution and consistency

    Metadata

    Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:

    Unique Participant ID
    File Name
    Age
    Gender
    Country
    Demographic Profile
    Type of Occlusion
    File Format

    This rich metadata helps train models that can recognize faces even when partially obscured.
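As a rough sketch of how the per-participant sets (5 occluded images plus 1 reference image) could be checked against the metadata, the snippet below groups records by participant. The key names `participant_id` and `occlusion_type`, and the convention that the reference image is marked `"none"`, are assumptions made for illustration; the listing does not specify the actual metadata schema.

```python
from collections import Counter, defaultdict


def incomplete_participant_sets(records, occluded_per_person=5, normal_per_person=1):
    """Return sorted participant IDs whose image set is not 5 occluded + 1 normal.

    Each record is a dict with the hypothetical keys `participant_id` and
    `occlusion_type` ("none" marks the unoccluded reference image).
    """
    counts = defaultdict(Counter)
    for record in records:
        kind = "normal" if record["occlusion_type"] == "none" else "occluded"
        counts[record["participant_id"]][kind] += 1
    return sorted(
        pid
        for pid, c in counts.items()
        if c["occluded"] != occluded_per_person or c["normal"] != normal_per_person
    )
```

A check like this is useful before training, because a participant with a missing reference image cannot be used for occluded-versus-normal face matching.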

    Use Cases & Applications

    This dataset is ideal for a wide range of real-world and research-focused applications, including:

    Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
    Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
    Biometric Identity Systems: Enhance verification accuracy across varying conditions
    KYC & Compliance: Support face matching even when the selfie includes common occlusions.
    Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage

    Secure & Ethical Collection

    Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
    Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
    Transparent Participation: All contributors provided written consent and were informed of the intended use

    Dataset Updates &

  8. LFW – Facial Recognition Dataset

    • gts.ai
    jpeg
    Updated Nov 20, 2023
    Cite
    Globose Technology Solutions Private Limited (2023). LFW – Facial Recognition Dataset [Dataset]. https://gts.ai/dataset-download/lfw-dataset-premier-facial-recognition-for-ai-tools/
    Available download formats: jpeg
    Dataset updated
    Nov 20, 2023
    Dataset authored and provided by
    Globose Technology Solutions Private Limited
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    LFW (Labeled Faces in the Wild) is a benchmark dataset for facial recognition research. It contains thousands of face images captured in real-world conditions and is primarily used for evaluating face verification and recognition algorithms.

  9. African Occluded Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). African Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-african
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the African Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.

    Facial Image Data

    The dataset comprises over 5,000 high-quality facial images, organized into participant-wise sets. Each set includes:

    Occluded Images: 5 images per individual featuring different types of facial occlusions, such as masks, caps, sunglasses, or combinations of these accessories
    Normal Image: 1 reference image of the same individual without any occlusion

    Diversity & Representation

    Geographic Coverage: Participants from across Kenya, Malawi, Nigeria, Ethiopia, Benin, Somalia, Uganda, and more African countries
    Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
    File Formats: Images available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure robustness and real-world utility, images were captured under diverse conditions:

    Lighting Variations: Includes both natural and artificial lighting scenarios
    Background Diversity: Indoor and outdoor backgrounds for model generalization
    Device Quality: Captured using the latest smartphones to ensure high resolution and consistency

    Metadata

    Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:

    Unique Participant ID
    File Name
    Age
    Gender
    Country
    Demographic Profile
    Type of Occlusion
    File Format

    This rich metadata helps train models that can recognize faces even when partially obscured.

    Use Cases & Applications

    This dataset is ideal for a wide range of real-world and research-focused applications, including:

    Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
    Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
    Biometric Identity Systems: Enhance verification accuracy across varying conditions
    KYC & Compliance: Support face matching even when the selfie includes common occlusions.
    Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage

    Secure & Ethical Collection

    Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
    Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
    Transparent Participation: All contributors provided written consent and were informed of the intended use

    Dataset

  10. Middle Eastern Occluded Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Middle Eastern Occluded Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-occlusion-middle-east
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Middle Eastern Human Face with Occlusion Dataset, carefully curated to support the development of robust facial recognition systems, occlusion detection models, biometric identification technologies, and KYC verification tools. This dataset provides real-world variability by including facial images with common occlusions, helping AI models perform reliably under challenging conditions.

    Facial Image Data

    The dataset comprises over 3,000 high-quality facial images, organized into participant-wise sets. Each set includes:

    Occluded Images: 5 images per individual featuring different types of facial occlusions, such as masks, caps, sunglasses, or combinations of these accessories
    Normal Image: 1 reference image of the same individual without any occlusion

    Diversity & Representation

    Geographic Coverage: Participants from across Egypt, Jordan, Saudi Arabia, UAE, Tunisia, and more Middle Eastern countries
    Demographics: Individuals aged 18 to 70 years, with a 60:40 male-to-female ratio
    File Formats: Images available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure robustness and real-world utility, images were captured under diverse conditions:

    Lighting Variations: Includes both natural and artificial lighting scenarios
    Background Diversity: Indoor and outdoor backgrounds for model generalization
    Device Quality: Captured using the latest smartphones to ensure high resolution and consistency

    Metadata

    Each image is paired with detailed metadata to enable advanced filtering, model tuning, and analysis:

    Unique Participant ID
    File Name
    Age
    Gender
    Country
    Demographic Profile
    Type of Occlusion
    File Format

    This rich metadata helps train models that can recognize faces even when partially obscured.

    Use Cases & Applications

    This dataset is ideal for a wide range of real-world and research-focused applications, including:

    Facial Recognition under Occlusion: Improve model performance when faces are partially hidden
    Occlusion Detection: Train systems to detect and classify facial accessories like masks or sunglasses
    Biometric Identity Systems: Enhance verification accuracy across varying conditions
    KYC & Compliance: Support face matching even when the selfie includes common occlusions.
    Security & Surveillance: Strengthen access control and monitoring systems in environments with mask usage

    Secure & Ethical Collection

    Data Security: Collected and processed securely on FutureBeeAI’s proprietary platform
    Ethical Compliance: Follows strict guidelines for participant privacy and informed consent
    Transparent Participation: All contributors provided written consent and were informed of the intended use

    Dataset

  11. The average recognition rates (%) and the corresponding standard deviations...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Jianzhong Wang; Yugen Yi; Wei Zhou; Yanjiao Shi; Miao Qi; Ming Zhang; Baoxue Zhang; Jun Kong (2023). The average recognition rates (%) and the corresponding standard deviations (%) of different algorithms on the test set of the AR face database with sunglasses and scarf occlusions (sub-image size 32×32). [Dataset]. http://doi.org/10.1371/journal.pone.0113198.t007
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jianzhong Wang; Yugen Yi; Wei Zhou; Yanjiao Shi; Miao Qi; Ming Zhang; Baoxue Zhang; Jun Kong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The average recognition rates (%) and the corresponding standard deviations (%) of different algorithms on the test set of the AR face database with sunglasses and scarf occlusions (sub-image size 32×32).

  12. Human Faces and Objects Mix Image Dataset

    • data.mendeley.com
    Updated Mar 13, 2025
    Cite
    Bindu Garg (2025). Human Faces and Objects Mix Image Dataset [Dataset]. http://doi.org/10.17632/nzwvnrmwp3.1
    Dataset updated
    Mar 13, 2025
    Authors
    Bindu Garg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Description: The Human Faces and Objects Dataset (HFO-5000) is a curated collection of 5,000 images, categorized into three distinct classes: male faces (1,500), female faces (1,500), and objects (2,000). This dataset is designed for machine learning and computer vision applications, including image classification, face detection, and object recognition. The dataset provides high-quality, labeled images with a structured CSV file for seamless integration into deep learning pipelines.

    Column Description: The dataset is accompanied by a CSV file that contains essential metadata for each image. The CSV file includes the following columns:

    • file_name: The name of the image file (e.g., image_001.jpg).
    • label: The category of the image, with three possible values: "male" (male face images), "female" (female face images), or "object" (images of various objects).
    • file_path: The full or relative path to the image file within the dataset directory.
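A minimal sketch of reading such a CSV and tallying the three labels is shown below. The column names come from the description above; the sample rows, the file paths in them, and the function name `count_labels` are made up for illustration, and the real file may differ in details such as delimiter or header order.

```python
import csv
import io
from collections import Counter

ALLOWED_LABELS = {"male", "female", "object"}


def count_labels(csv_file):
    """Count images per label, rejecting labels outside the documented set."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if row["label"] not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label {row['label']!r} for {row['file_name']}")
        counts[row["label"]] += 1
    return counts


# Hypothetical sample standing in for the dataset's CSV file.
sample = io.StringIO(
    "file_name,label,file_path\n"
    "image_001.jpg,male,images/image_001.jpg\n"
    "image_002.jpg,female,images/image_002.jpg\n"
    "image_003.jpg,object,images/image_003.jpg\n"
    "image_004.jpg,object,images/image_004.jpg\n"
)
label_counts = count_labels(sample)
```

On the full dataset, the counts would be expected to match the stated class sizes (1,500 male, 1,500 female, 2,000 object), which makes this a quick integrity check after download.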

    Uniqueness and Key Features:

    1. Balanced Distribution: The dataset maintains an even distribution of human faces (male and female) to minimize bias in classification tasks.
    2. Diverse Object Selection: The object category consists of a wide variety of items, ensuring robustness in distinguishing between human and non-human entities.
    3. High-Quality Images: The dataset consists of clear and well-defined images, suitable for both training and testing AI models.
    4. Structured Annotations: The CSV file simplifies dataset management and integration into machine learning workflows.
    5. Potential Use Cases: This dataset can be used for tasks such as gender classification, facial recognition benchmarking, human-object differentiation, and transfer learning applications.

    Conclusion: The HFO-5000 dataset provides a well-structured, diverse, and high-quality set of labeled images that can be used for various computer vision tasks. Its balanced distribution of human faces and objects ensures fairness in training AI models, making it a valuable resource for researchers and developers. By offering structured metadata and a wide range of images, this dataset facilitates advancements in deep learning applications related to facial recognition and object classification.

  13. Face Recognition Dataset – 10,109 People with Multi-angle Face Images and...

    • nexdata.ai
    Updated Jun 14, 2024
    Cite
    Nexdata (2024). Face Recognition Dataset – 10,109 People with Multi-angle Face Images and Demographic Labels [Dataset]. https://www.nexdata.ai/datasets/1402?source=Github
    Dataset updated
    Jun 14, 2024
    Dataset authored and provided by
    Nexdata
    Variables measured
    Data size, Data format, Data diversity, Age distribution, Race distribution, Gender distribution, Collecting environment
    Description

    This large-scale face image dataset features 10,109 individuals from various countries and ethnic backgrounds. Each subject has been captured in multiple real-world scenarios, resulting in diverse facial images under varying angles, lighting conditions, and expressions. Detailed annotations include gender, race, and age, making the dataset suitable for tasks such as facial recognition, face clustering, demographic analysis, and machine learning model training. The dataset has been validated by multiple AI companies and proven to deliver strong performance in real-world applications. All data collection, storage, and processing strictly adhere to global data protection regulations, including GDPR, CCPA, and PIPL, ensuring legal compliance and privacy preservation.

  14. Benchmark-Images-for-Stable-Diffusion-Bias

    • huggingface.co
    Updated Apr 16, 2024
    Cite
    Reece Iriye (2024). Benchmark-Images-for-Stable-Diffusion-Bias [Dataset]. https://huggingface.co/datasets/ririye/Benchmark-Images-for-Stable-Diffusion-Bias
    Dataset updated
    Apr 16, 2024
    Authors
    Reece Iriye
    Description

    ririye/Benchmark-Images-for-Stable-Diffusion-Bias dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. WIDER FACE

    • kaggle.com
    zip
    Updated Jan 5, 2019
    Cite
    Motaz Saad (2019). WIDER FACE [Dataset]. https://www.kaggle.com/datasets/mksaad/wider-face-a-face-detection-benchmark
    Available download formats: zip (3662993623 bytes)
    Dataset updated
    Jan 5, 2019
    Authors
    Motaz Saad
    Description

    Dataset

    This dataset was created by yeheak

    Contents

  16. SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and...

    • zenodo.org
    zip
    Updated Dec 15, 2024
    Cite
    Riya Samanta; Bidyut Saha (2024). SoloFace: A Single-Face Dataset for Resource-Constrained Face Detection and Tracking [Dataset]. http://doi.org/10.5281/zenodo.14474899
    Available download formats: zip
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Riya Samanta; Bidyut Saha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SoloFace is a custom dataset derived from the COCO-Faces and Visual Wake Word datasets, specifically designed for single-face detection tasks in resource-constrained environments. This dataset is ideal for developing machine learning models for embedded AI applications, such as TinyML, which operate on low-power devices. Each image either contains a single human face or no face, with corresponding labels providing class information and bounding box coordinates for face detection. The dataset includes data augmentation to ensure robustness across diverse conditions, such as variations in lighting, scale, and orientation.

    Dataset Structure
    The dataset is organized into three subsets: train, test, and val. Each subset contains:

    • images/: .jpg image files.
    • labels/: .json label files with matching filenames to the images.

    Label Format
    Each .json label file includes:

    • image: Name of the corresponding image file.
    • class: 1 if a face is present, 0 otherwise.
    • bbox: Normalized bounding box coordinates [top_left_x, top_left_y, bottom_right_x, bottom_right_y]. If no face is present, the bounding box is set to [0.0, 0.0, 0.01, 0.01].
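To make the label format concrete, here is a small sketch that parses one label and converts the normalized box to pixel coordinates. The keys `image`, `class`, and `bbox` follow the description above; the function name `face_box_in_pixels`, the assumption that each file holds a single JSON object, and the image size passed in by the caller are illustrative choices rather than part of the dataset documentation.

```python
import json


def face_box_in_pixels(label_json, width, height):
    """Return (x1, y1, x2, y2) in pixels, or None when the label marks no face.

    `label_json` is the text of one label file (assumed to be a single JSON
    object); `width` and `height` are the image size in pixels.
    """
    label = json.loads(label_json)
    if label["class"] == 0:
        # No-face images carry a placeholder box, so ignore the bbox entirely.
        return None
    x1, y1, x2, y2 = label["bbox"]
    return (round(x1 * width), round(y1 * height), round(x2 * width), round(y2 * height))


example = '{"image": "example.jpg", "class": 1, "bbox": [0.25, 0.5, 0.75, 1.0]}'
box = face_box_in_pixels(example, width=320, height=240)
```

Branching on `class` rather than on the bbox values avoids treating the `[0.0, 0.0, 0.01, 0.01]` placeholder as a tiny real face.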

    Statistics

    • Original Dataset:

      • Training images: 11,272
      • Testing images: 3,732
      • Validation images: 434
    • After Data Augmentation:

      • Training images: 56,360
      • Testing and validation images remain unchanged.
    • Class Distribution:

      • 50% of images contain a single visible human face.
      • 50% contain no human face.

    Data Augmentation Details
    To improve model robustness, the following augmentation techniques were applied to the training set:

    1. Geometric Transformations: Random rotation (±15 degrees), scaling (±20%), and horizontal flipping (50%).
    2. Color Transformations: Brightness and contrast adjustments (±30%).
    3. Cropping: Random cropping up to 10% from image edges.

    Each augmentation preserved bounding box consistency with the transformed images.
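Bounding box consistency is easiest to see for the horizontal flip. The sketch below is an illustration of the idea, not the authors' augmentation code: it assumes the normalized `[top_left_x, top_left_y, bottom_right_x, bottom_right_y]` format described earlier and represents an image as a list of pixel rows purely for demonstration.

```python
def hflip_bbox(bbox):
    """Mirror a normalized [x1, y1, x2, y2] box around the vertical center line."""
    x1, y1, x2, y2 = bbox
    # The old right edge becomes the new left edge, and vice versa.
    return [1.0 - x2, y1, 1.0 - x1, y2]


def hflip_image(rows):
    """Mirror an image given as a list of pixel rows (illustrative representation)."""
    return [row[::-1] for row in rows]


flipped_box = hflip_bbox([0.25, 0.25, 0.5, 0.75])
flipped_image = hflip_image([[1, 2, 3], [4, 5, 6]])
```

Swapping the two x coordinates, rather than only subtracting each from 1, keeps the top-left corner to the left of the bottom-right corner after the flip.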

    Usage
    This dataset supports the following use cases:

    1. Training lightweight face detection models optimized for microcontroller deployment.
    2. Benchmarking single-face detection models in resource-constrained environments.
    3. Research on model robustness and efficiency.

    Loading the Dataset

    1. Download the dataset.
    2. Extract the dataset using:
      unzip soloface-detection-dataset.zip
      
    3. Dataset structure:
      soloface-detection-dataset/
      ├── train/
      │  ├── images/
      │  ├── labels/
      ├── test/
      │  ├── images/
      │  ├── labels/
      ├── val/
      │  ├── images/
      │  ├── labels/
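
    Given the directory layout above, images and labels can be paired by filename. The following is a minimal sketch under that assumption; the function name `paired_samples` is hypothetical, and the root path in the usage comment is simply the extracted folder name shown in the tree.

```python
from pathlib import Path

def paired_samples(root, split):
    """Yield (image_path, label_path) pairs whose filename stems match."""
    split_dir = Path(root) / split
    for image_path in sorted((split_dir / "images").glob("*.jpg")):
        label_path = split_dir / "labels" / (image_path.stem + ".json")
        if label_path.exists():  # skip images that have no label file
            yield image_path, label_path

# Usage, assuming the archive was extracted into the current directory:
# for image_path, label_path in paired_samples("soloface-detection-dataset", "train"):
#     ...
```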
      

    License
    This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    • Permissions: Copy, distribute, and adapt for any purpose, including commercial.
    • Conditions: Provide proper attribution, a link to the license, and indicate changes.
    • Restrictions: No additional legal or technological restrictions.

    For more details, visit the CC BY 4.0 License.

    Contact
    For inquiries or collaborations, please contact:

    • Bidyut Saha: sahabidyut999@gmail.com
    • Riya Samanta: study.riya1792@gmail.com


  17. Labelled Faces in the Wild (LFW) Dataset

    • kaggle.com
    zip
    Updated Feb 7, 2024
    Marvin Luckianto (2024). Labelled Faces in the Wild (LFW) Dataset [Dataset]. https://www.kaggle.com/datasets/marvinluckianto/labelled-faces-in-the-wild-lfw-dataset
    Explore at:
    Available download formats: zip (117895655 bytes)
    Dataset updated
    Feb 7, 2024
    Authors
    Marvin Luckianto
    Description

    Context
    Labeled Faces in the Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition. The database was created and is maintained by researchers at the University of Massachusetts, Amherst (specific references are in the Acknowledgements section). Its 13,233 images of 5,749 people were collected from the web, with faces detected and centered by the Viola-Jones face detector. Of the people pictured, 1,680 have two or more distinct photos in the dataset. The original database contains four different sets of LFW images and three different types of "aligned" images. According to the researchers, deep-funneled images produced superior results for most face verification algorithms compared to the other image types. Hence, the dataset uploaded here is the deep-funneled version.

    Content
    There are 11 files in this dataset. lfw-deepfunneled.zip contains the images. The other 10 files are metadata that may help you form training and testing sets for your model. The two sections below help you navigate the files: the first covers the images, and the second explains the content of each metadata file.

    Image information:

    • Image file format: Each image is available as "lfw/name/name_xxxx.jpg" where "xxxx" is the image number padded to four characters with leading zeroes. For example, the 10th George_W_Bush image can be found as "lfw/George_W_Bush/George_W_Bush_0010.jpg"
    • Image dimensions: Each image is a 250x250 jpg, detected and centered using the OpenCV implementation of the Viola-Jones face detector. The cropping region returned by the detector was then automatically enlarged by a factor of 2.2 in each dimension to capture more of the head and then scaled to a uniform size.
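
    The naming scheme above can be expressed as a small helper. This is a minimal sketch that assumes the relative layout is exactly "lfw/name/name_xxxx.jpg" as described; the function name `lfw_image_path` is hypothetical, and the root folder inside the uploaded zip may differ.

```python
def lfw_image_path(name, number):
    """Build the relative path of the `number`-th image of person `name`."""
    # The image number is zero-padded to four characters.
    return f"lfw/{name}/{name}_{number:04d}.jpg"

print(lfw_image_path("George_W_Bush", 10))  # lfw/George_W_Bush/George_W_Bush_0010.jpg
```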

    Metadata information:

    • lfwallnames.csv: Contains the name of every person in the dataset along with the number of images of that person.
    • lfwreadme.csv: Comprehensive readme file found on the original database. If there is any information you are missing here or are looking for additional resources you will probably find it in this file. It explains how each .csv file comes into play when forming training and testing models, as well as column metadata information for figuring out what the .csv is talking about. The original website also gives recommendations on training/testing splits and comparison benchmarks.

    There are two recommended configurations for developing training and testing sets (pairs vs people). Depending on which route you choose, you will use the following .csv files:

    • pairs.csv: Contains randomly generated splits for 10-fold cross validation specifically for pairs. Use this for the image restricted configuration when forming training sets (refer to readme). There are 10 sets in total; each contains 300 matched pairs and 300 mismatched pairs.
    • people.csv: Contains randomly generated splits for 10-fold cross validation specifically for individual faces. Use this for the unrestricted configuration when forming training sets (refer to readme). There are 10 sets in total, with varying numbers of people: Set 1: 601, Set 2: 555, Set 3: 552, Set 4: 560, Set 5: 567, Set 6: 527, Set 7: 597, Set 8: 601, Set 9: 580, Set 10: 609.
    • matchpairsDevTest.csv: Use this testing set if you decide to go with the pairs configuration. Contains 500 matched pairs of faces for testing set.
    • matchpairsDevTrain.csv: Use this training set if you decide to go with the pairs configuration. Contains 1100 matched pairs of faces for training set.
    • mismatchpairsDevTest.csv: Use this testing set if you decide to go with the pairs configuration. Contains 500 mismatched pairs of faces for testing set.
    • mismatchpairsDevTrain.csv: Use this training set if you decide to go with the pairs configuration. Contains 1100 mismatched pairs of faces for training set.
    • peopleDevTest.csv: Use this testing set if you decide to go with the people configuration. Contains 1711 people and 3708 images.
    • peopleDevTrain.csv: Use this training set if you decide to go with the people configuration. Contains 4038 people and 9525 images.
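
    The counts quoted in this description are mutually consistent, which the short check below makes explicit. All numbers are copied from the text above; the variable names are hypothetical.

```python
# People per cross-validation set, as listed for people.csv.
people_per_set = [601, 555, 552, 560, 567, 527, 597, 601, 580, 609]

# People and images in the development split files.
dev_people = {"test": 1711, "train": 4038}
dev_images = {"test": 3708, "train": 9525}

print(sum(people_per_set))       # 5749
print(sum(dev_people.values()))  # 5749
print(sum(dev_images.values()))  # 13233
```

    Both people totals equal the 5,749 people stated for the dataset, and the image total equals the 13,233 images, so the ten sets and the development train/test files each partition the full dataset.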

    Acknowledgements
    All data and metadata were originally found on http://vis-www.cs.umass.edu/lfw/. Please visit the site for other data versions, including the original, non-aligned data, as well as more information on errata and training/testing model resources.

    A big thank you and kudos to the creators of this dataset and relevant research:

    Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. University of Massachusetts, Amherst, Technical Report 07-49, October, 2007.

    Specifically for the deep-funneled version of the image data:

    Gary B....

  18. Face mask detection and masked facial recognition dataset (MDMFR Dataset)

    • data.niaid.nih.gov
    Updated Apr 8, 2022
    NAEEM ULLAH; Ali Javed (2022). Face mask detection and masked facial recognition dataset (MDMFR Dataset) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6408602
    Explore at:
    Dataset updated
    Apr 8, 2022
    Dataset provided by
    University of Engineering and Technology, Taxila
    Authors
    NAEEM ULLAH; Ali Javed
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The lack of a unified standard dataset for face mask detection and masked facial recognition motivated us to develop the in-house MDMFR dataset (MDMFR, 2022) to measure the performance of methods for both tasks. The two tasks have different dataset requirements: face mask detection requires images of multiple persons with and without masks, whereas masked facial recognition requires multiple masked face images of the same person. Our MDMFR dataset therefore consists of two main collections: 1) face mask detection and 2) masked facial recognition. There are 6006 images in our MDMFR dataset. The face mask detection collection contains two categories of face images, masked and unmasked: 3174 images with masks and 2832 images without masks. To construct the dataset, we captured multiple images of the same person in two configurations (with and without a mask). The masked facial recognition collection contains a total of 2896 masked images of 226 persons. The dataset includes images of male and female persons of all ages, including children. The images are diverse in terms of gender, race, and age of users, types of masks, illumination conditions, face angles, occlusions, environment, format, dimensions, and size. All images have a bit depth of 24. During preprocessing, images were cropped in Adobe Photoshop to exclude extra information such as the neck and shoulders. Because the input size of our DeepMaskNet model is 256-by-256, all images were then resized to 256-by-256 pixels using the publicly available Plastiliq Image Resizer software (Plastiliq, 2022).

  19. FileMarket | Diverse Human Face Data | 20,000 IDs | Face Recognition Data |...

    • datarade.ai
    Updated Jul 5, 2024
    FileMarket (2024). FileMarket | Diverse Human Face Data | 20,000 IDs | Face Recognition Data | Image/Video AI Training Data | Biometric Data [Dataset]. https://datarade.ai/data-products/filemarket-diverse-human-face-data-20-000-ids-face-reco-filemarket
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Jul 5, 2024
    Dataset authored and provided by
    FileMarket
    Area covered
    Georgia, Oman, Iceland, Sri Lanka, Hong Kong, Libya, Curaçao, United Kingdom, Martinique, Kyrgyzstan
    Description

    Biometric Data

    FileMarket provides a comprehensive Biometric Data set, ideal for enhancing AI applications in security, identity verification, and more. In addition to Biometric Data, we offer specialized datasets across Object Detection Data, Machine Learning (ML) Data, Large Language Model (LLM) Data, and Deep Learning (DL) Data. Each dataset is meticulously crafted to support the development of cutting-edge AI models.

    Data Size: 20,000 IDs

    Race Distribution: The dataset encompasses individuals from diverse racial backgrounds, including Black, Caucasian, Indian, and Asian groups.

    Gender Distribution: The dataset equally represents all genders, ensuring a balanced and inclusive collection.

    Age Distribution: The data spans a broad age range, including young, middle-aged, and senior individuals, providing comprehensive age coverage.

    Collection Environment: Data has been gathered in both indoor and outdoor environments, ensuring variety and relevance for real-world applications.

    Data Diversity: This dataset includes a rich variety of face poses, racial backgrounds, age groups, lighting conditions, and scenes, making it ideal for robust biometric model training.

    Device: All data has been collected using mobile phones, reflecting common real-world usage scenarios.

    Data Format: The data is provided in .jpg and .png formats, ensuring compatibility with various processing tools and systems.

    Accuracy: The labels for face pose, race, gender, and age are highly accurate, exceeding 95%, making this dataset reliable for training high-performance biometric models.

  20. Perceptual expertise in forensic facial image comparison

    • data.niaid.nih.gov
    • dataone.org
    • +2more
    zip
    Updated Sep 24, 2015
    David White; P. Jonathon Phillips; Carina A. Hahn; Matthew Hill; Alice J. O'Toole (2015). Perceptual expertise in forensic facial image comparison [Dataset]. http://doi.org/10.5061/dryad.ng720
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 24, 2015
    Dataset provided by
    National Institute of Standards and Technology
    The University of Texas at Dallas
    Authors
    David White; P. Jonathon Phillips; Carina A. Hahn; Matthew Hill; Alice J. O'Toole
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Forensic facial identification examiners are required to match the identity of faces in images that vary substantially, owing to changes in viewing conditions and in a person's appearance. These identifications affect the course and outcome of criminal investigations and convictions. Despite calls for research on sources of human error in forensic examination, existing scientific knowledge of face matching accuracy is based, almost exclusively, on people without formal training. Here, we administered three challenging face matching tests to a group of forensic examiners with many years' experience of comparing face images for law enforcement and government agencies. Examiners outperformed untrained participants and computer algorithms, thereby providing the first evidence that these examiners are experts at this task. Notably, computationally fusing responses of multiple experts produced near-perfect performance. Results also revealed qualitative differences between expert and non-expert performance. First, examiners' superiority was greatest at longer exposure durations, suggestive of more entailed comparison in forensic examiners. Second, experts were less impaired by image inversion than non-expert students, contrasting with face memory studies that show larger face inversion effects in high performers. We conclude that expertise in matching identity across unfamiliar face images is supported by processes that differ qualitatively from those supporting memory for individual faces.

The Chinese University of Hong Kong (2022). wider_face [Dataset]. https://huggingface.co/datasets/CUHK-CSE/wider_face

wider_face

WIDER FACE

CUHK-CSE/wider_face

Explore at:
19 scholarly articles cite this dataset (View in Google Scholar)
