VGGFace2 is a large-scale face recognition dataset. Images are downloaded from Google Image Search and have large variations in pose, age, illumination, ethnicity and profession. VGGFace2 contains images from identities spanning a wide range of different ethnicities, accents, professions and ages. All face images are captured "in the wild", with pose and emotion variations and different lighting and occlusion conditions. The number of images per identity varies from 87 to 843, with an average of 362 images per subject.
This dataset was created by Shrijeet16
This dataset was created by Ansari
This dataset was created by veeru
This dataset was created by Shiv Sharma
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by twang_hcmut
Released under Apache 2.0
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset was made for experimenting with face detection algorithms:
- YOLO
- VGG Face
- CNN
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
This dataset has been used in this paper: Face Clustering for Connection Discovery from Event Images.
Data was collected from pailixiang.com, a Chinese live photo-sharing platform. The event organizer uploads event images to the website during the event, and they are shared publicly online. Images come with no information other than the upload time and the number of views. As no identity information is available, faces are manually labeled with identities using custom-developed software. After manual labeling, the data set contains over 3,000 labeled participants across more than 40,000 faces in 8,837 images.
In the dataset:
Note that the faces are detected using MTCNN.
VGGSound
VGG-Sound is an audio-visual correspondent dataset consisting of short clips of audio sounds, extracted from videos uploaded to YouTube.
Homepage: https://www.robots.ox.ac.uk/~vgg/data/vggsound/
Paper: https://arxiv.org/abs/2004.14368
GitHub: https://github.com/hche11/VGGSound
Analysis
310+ classes: VGG-Sound contains audio clips spanning a large number of challenging acoustic environments and noise characteristics of real applications.
200,000+ videos: All… See the full description on the dataset page: https://huggingface.co/datasets/Loie/VGGSound.
This dataset was created by mahmoudbelooo
This dataset was created by Bhargavi
BeyondDeepFakeDetection/VGG dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Background: Williams-Beuren syndrome (WBS) is a rare genetic syndrome with a characteristic “elfin” facial gestalt. The “elfin” facial characteristics include a broad forehead, periorbital puffiness, flat nasal bridge, short upturned nose, wide mouth, thick lips, and pointed chin. Recently, deep convolutional neural networks (CNNs) have been successfully applied to facial recognition for diagnosing genetic syndromes. However, there is little research on WBS facial recognition using deep CNNs.
Objective: The purpose of this study was to construct an automatic facial recognition model for WBS diagnosis based on deep CNNs.
Methods: The study enrolled 104 WBS children, 91 cases with other genetic syndromes, and 145 healthy children. The photo dataset used only one frontal facial photo from each participant. Five face recognition frameworks for WBS were constructed by adopting the VGG-16, VGG-19, ResNet-18, ResNet-34, and MobileNet-V2 architectures, respectively. ImageNet transfer learning was used to avoid over-fitting. The classification performance of the facial recognition models was assessed by five-fold cross validation, and comparison with human experts was performed.
Results: The five face recognition frameworks for WBS were constructed. The VGG-19 model achieved the best performance. The accuracy, precision, recall, F1 score, and area under curve (AUC) of the VGG-19 model were 92.7 ± 1.3%, 94.0 ± 5.6%, 81.7 ± 3.6%, 87.2 ± 2.0%, and 89.6 ± 1.3%, respectively. The highest accuracy, precision, recall, F1 score, and AUC of human experts were 82.1, 65.9, 85.6, 74.5, and 83.0%, respectively. The AUCs of each human expert were inferior to the AUCs of the VGG-16 (88.6 ± 3.5%), VGG-19 (89.6 ± 1.3%), ResNet-18 (83.6 ± 8.2%), and ResNet-34 (86.3 ± 4.9%) models.
Conclusions: This study highlighted the possibility of using deep CNNs for diagnosing WBS in clinical practice. The facial recognition framework based on VGG-19 could play a prominent role in WBS diagnosis. Transfer learning technology can help to construct facial recognition models of genetic syndromes with small-scale datasets.
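The five-fold cross validation mentioned in the Methods can be illustrated with a minimal sketch. It assumes a stratified split (each fold keeps roughly the same class mix), which the abstract does not state; the function name `stratified_folds`, the label strings, and the seed are hypothetical, and only the three group sizes come from the abstract.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with near-equal class balance.

    Assumption: stratification is used here for illustration only; the
    study's actual fold assignment is not described in the abstract.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples out to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Group sizes from the abstract; the label names are placeholders.
labels = ["WBS"] * 104 + ["other_syndrome"] * 91 + ["healthy"] * 145
folds = stratified_folds(labels)

for held_out, test_idx in enumerate(folds):
    # Every fold serves once as the test set; the rest form the training set.
    train_idx = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
    print(f"fold {held_out}: train={len(train_idx)} test={len(test_idx)}")
```

Each participant lands in exactly one test fold, so the reported mean ± standard deviation would be taken over the five held-out evaluations.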
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
@article{DBLP:journals/corr/abs-1710-08092, author = {Qiong Cao and Li Shen and Weidi Xie and Omkar M. Parkhi and Andrew Zisserman}, title = {VGGFace2: {A} dataset for recognising faces across pose and age}, journal = {CoRR}, volume = {abs/1710.08092}, year = {2017}, url = {http://arxiv.org/abs/1710.08092}, eprinttype = {arXiv}, eprint = {1710.08092}… See the full description on the dataset page: https://huggingface.co/datasets/ProgramComputer/VGGFace2.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
Face Recognition Model using VGG
This model uses a VGG architecture to perform face recognition. It is trained to recognize and classify faces by extracting deep facial features. The VGG-based model provides high accuracy by utilizing a pre-trained convolutional neural network, fine-tuned for the task of face identification. The dataset consists of labeled facial images, and the model achieves reliable recognition across various face datasets. It is suited to applications such as attendance systems, security, and personal identification systems.
ZZZtong/common-accent-vgg-ready dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0 (Public Domain): https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Duong Thanh Tran
Released under CC0: Public Domain
Dataset Card for Flowers Dataset
Dataset Summary
VGG have created a 17-category flower dataset with 80 images for each class. The flowers chosen are some common flowers in the UK. The images have large scale, pose and light variations, and there are also classes with large variations of images within the class and close similarity to other classes. The categories can be seen in the figure below. We randomly split the dataset into 3… See the full description on the dataset page: https://huggingface.co/datasets/Guldeniz/flower_dataset.
Dataset Card for "MJSynth_text_recognition"
This is the MJSynth dataset for text recognition on document images, synthetically generated, covering 90K English words. It includes training, validation and test splits. Source of the dataset: https://www.robots.ox.ac.uk/~vgg/data/text/ Use the dataset streaming functionality to try out the dataset quickly without downloading the entire dataset (see https://huggingface.co/docs/datasets/stream). Citation details provided on the source… See the full description on the dataset page: https://huggingface.co/datasets/priyank-m/MJSynth_text_recognition.
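The streaming approach suggested above might look like the following sketch. It relies on `load_dataset(..., streaming=True)` from the Hugging Face `datasets` library; the split name `"train"` and the helper names `take`, `open_stream`, and `preview` are assumptions made for illustration, and the record fields are not listed in the card, so the sketch only prints whatever keys each record has.

```python
from itertools import islice

# Repository id taken from the dataset page URL in the card.
REPO_ID = "priyank-m/MJSynth_text_recognition"


def take(stream, n):
    """Return the first n items of any iterable without reading further."""
    return list(islice(stream, n))


def open_stream(repo_id=REPO_ID, split="train"):
    """Open a split lazily; the split name 'train' is an assumption."""
    # Imported here so `take` stays usable without the `datasets` package.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split, streaming=True)


def preview(n=3):
    """Print the field names of the first n streamed records (needs network)."""
    for example in take(open_stream(), n):
        print(sorted(example.keys()))
```

Calling `preview()` fetches only the first few records, so the full archive never has to be downloaded just to inspect the data.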
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
The Oxford-IIIT Pet Dataset
Description
A 37-category pet dataset with roughly 200 images for each class. The images have large variations in scale, pose and lighting. This instance of the dataset uses standard label ordering and includes the standard train/test splits. Trimaps and bounding boxes are not included, but there is an image_id field that can be used to reference those annotations from official metadata. Website: https://www.robots.ox.ac.uk/~vgg/data/pets/… See the full description on the dataset page: https://huggingface.co/datasets/timm/oxford-iiit-pet.
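As a rough sketch of how the image_id field could be used, the helper below builds the paths where the matching trimap and bounding-box files would sit in a local copy of the official annotations. It assumes that image_id equals the official file stem and that the annotations are unpacked into `trimaps/` and `xmls/` subfolders; both points should be checked against the official download, and the function name `annotation_paths` and the example id are hypothetical.

```python
from pathlib import PurePosixPath


def annotation_paths(image_id, root="annotations"):
    """Map an image_id to candidate annotation file paths.

    Assumptions: image_id matches the official file stem, and the
    annotation archive is extracted under `root` with `trimaps/` and
    `xmls/` subfolders. A bounding-box file may not exist for every image.
    """
    base = PurePosixPath(root)
    return {
        "trimap": str(base / "trimaps" / f"{image_id}.png"),
        "bbox_xml": str(base / "xmls" / f"{image_id}.xml"),
    }


# Example with a made-up id in the dataset's breed_number naming style.
paths = annotation_paths("Abyssinian_100")
print(paths["trimap"])
print(paths["bbox_xml"])
```

Keeping the lookup as a pure path computation means a missing annotation can be detected with a simple existence check before any file is opened.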