CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:
- 10,177 identities,
- 202,599 face images, and
- 5 landmark locations and 40 binary attribute annotations per image.
The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.
Note: the CelebA dataset may contain bias. The Fairness Indicators example goes into detail about several considerations to keep in mind while using the CelebA dataset.
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split and print the first four examples.
ds = tfds.load('celeb_a', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
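As a rough sketch of reading individual examples (assuming the TFDS feature names image, attributes, and landmarks, which may differ between catalog versions), each example is a dictionary of tensors:

import tensorflow_datasets as tfds

ds = tfds.load('celeb_a', split='train')
for ex in ds.take(1):
    image = ex['image']                 # uint8 image tensor
    is_male = ex['attributes']['Male']  # one of the 40 binary attributes
    landmarks = ex['landmarks']         # 5 landmark locations
    print(image.shape, bool(is_male))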
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/celeb_a-2.1.0.png
License: other (https://choosealicense.com/licenses/other/)
Dataset Card for CelebA
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:
- 10,177 identities,
- 202,599 face images, and
- 5 landmark locations and 40 binary attribute annotations per image.
The… See the full description on the dataset page: https://huggingface.co/datasets/flwrlabs/celeba.
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The CelebA-HQ dataset is a high-quality version of CelebA that consists of 30,000 images at 1024×1024 resolution.
License: not specified (https://academictorrents.com/nolicensespecified)
CelebAMask-HQ is a large-scale face image dataset with 30,000 high-resolution face images selected from the CelebA dataset by following CelebA-HQ. Each image has a segmentation mask of facial attributes corresponding to CelebA. The masks of CelebAMask-HQ were manually annotated at a size of 512 x 512 with 19 classes covering all facial components and accessories, such as skin, nose, eyes, eyebrows, ears, mouth, lip, hair, hat, eyeglass, earring, necklace, neck, and cloth. CelebAMask-HQ can be used to train and evaluate algorithms for face parsing, face recognition, and GANs for face generation and editing. It is applicable to several research fields, including facial image manipulation, face parsing, face recognition, and face hallucination, with interactive facial image manipulation as a showcased application.
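As a minimal sketch of inspecting one of these masks, assuming the annotations are stored as per-component binary PNGs at 512 x 512 (the file path below is hypothetical and may not match the official release layout):

import numpy as np
from PIL import Image

# Hypothetical path to one per-component mask (here: hair) for image 0.
mask = np.array(Image.open("CelebAMask-HQ-mask-anno/0/00000_hair.png").convert("L"))
print(mask.shape)       # expected (512, 512)
print(np.unique(mask))  # binary values: background vs. the annotated component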
https://github.com/weihaox/Multi-Modal-CelebA-HQ-Dataset
Multi-Modal-CelebA-HQ is a large-scale face image dataset with 30,000 high-resolution face images selected from the CelebA dataset by following CelebA-HQ. Each image has a high-quality segmentation mask, a sketch, a descriptive text, and a version with a transparent background. Multi-Modal-CelebA-HQ can be used to train and evaluate algorithms for text-to-image generation, text-guided image manipulation, sketch-to-image generation, image captioning, and VQA. This dataset is proposed and used in TediGAN.
CelebAHairMask-HQ
CelebAHairMask-HQ is an extended dataset of CelebAMask-HQ for hair segmentation or hair matting. CelebAMask-HQ is a large-scale face image dataset with 30,000 high-resolution face images selected from the CelebA dataset by following CelebA-HQ. Each image has a segmentation mask of facial attributes corresponding to CelebA. The masks of CelebAHairMask-HQ were auto-annotated at a size of 1024 x 1024. CelebAHairMask-HQ can be used to train and evaluate… See the full description on the dataset page: https://huggingface.co/datasets/cpuimage/CelebAHairMask-HQ.
CelebFaces Attributes dataset contains 202,599 face images of size 178×218 from 10,177 celebrities, each annotated with 40 binary labels indicating facial attributes such as hair color, gender, and age.
CelebA Female Dataset
Dataset Description
This dataset is a filtered subset of the CelebA dataset (CelebFaces Attributes), containing only female faces. The original CelebA dataset is a large-scale face attributes dataset with more than 200,000 celebrity images, each with 40 attribute annotations.
Dataset Creation
This dataset was created by:
- Loading the original CelebA dataset
- Filtering to keep only images labeled as female (based on the "Male"… See the full description on the dataset page: https://huggingface.co/datasets/MnLgt/CelebA-females.
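A minimal sketch of this kind of filtering, using the TFDS copy of CelebA (the attribute name 'Male' follows the TFDS feature spec; the Hugging Face dataset may expose the label differently):

import tensorflow as tf
import tensorflow_datasets as tfds

ds = tfds.load('celeb_a', split='train')
# Keep only examples whose 'Male' attribute is False, i.e. female faces.
females = ds.filter(lambda ex: tf.logical_not(ex['attributes']['Male']))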
CelebA-Spoof is a large-scale face anti-spoofing dataset with the following properties:
- Quantity: CelebA-Spoof comprises 625,537 pictures of 10,177 subjects, significantly larger than the existing datasets.
- Diversity: The spoof images are captured from 8 scenes (2 environments × 4 illumination conditions) with more than 10 sensors.
- Annotation richness: CelebA-Spoof contains 10 spoof type annotations, as well as the 40 attribute annotations inherited from the original CelebA dataset.
Description
K-pop Idol Dataset - Female (KID-F) is the first dataset of high-quality K-pop idol face images. It consists of about 6,000 high-quality face images at 512x512 resolution and identity labels for each image.
We collected about 90,000 K-pop female idol images and cropped the face from each image. We then selected the high-quality face images, resulting in about 6,000 high-quality face images in this dataset.
There are 300 test images for benchmarking. There are no duplicate images between the test and train sets, and some identities in the test images do not appear in the train images (meaning some test images show identities that are new to the trained model). Each test image has a degraded pair, which can be used to test face super-resolution performance.
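The exact degradation applied to the released test pairs is not described here; as a generic, hypothetical way to produce a degraded counterpart for super-resolution testing, one could down- and up-sample an image with bicubic interpolation:

from PIL import Image

img = Image.open("example_face.png")               # hypothetical 512x512 test image
small = img.resize((128, 128), Image.BICUBIC)      # downsample to lose detail
degraded = small.resize(img.size, Image.BICUBIC)   # upsample back to the original size
degraded.save("example_face_degraded.png")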
We also provide identity labels for each image. You can download the CSV file from our GitHub repository.
Download
You can download the dataset from Google Drive.
Agreement
The use of this software is RESTRICTED to non-commercial research and educational purposes. All images of the KID-F dataset are obtained from the internet and are not the property of EDA (PCEO-AI-CLUB). EDA is not responsible for the content nor the meaning of these images. You agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit for any commercial purposes any portion of the images or any portion of derived data. You agree not to further copy, publish, or distribute any portion of the KID-F dataset, except that making copies of the dataset is allowed for internal use at a single site within the same organization. EDA reserves the right to terminate your access to the KID-F dataset at any time.
Context
This dataset is used for gender classification with images. It consists of almost 20K images totaling about 132 MB.
Acknowledgments
This dataset is preprocessed from the CelebFace dataset created by Jessica Li (https://www.kaggle.com/jessicali9530). Thank you so much, Jessica, for providing a wonderful dataset to the community.
Inspiration
The inspiration to create this dataset is the CelebFace dataset created by Jessica Li (https://www.kaggle.com/jessicali9530). I have extracted this dataset from the CelebFace dataset so that you can directly use it for gender classification without preprocessing.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Multi-view Facial Image Dataset Based on CelebA: A dataset of facial images from several viewing angles was created by Aristotle University of Thessaloniki based on the CelebA image dataset, using software developed in the OpenDR H2020 research project based on this paper and the respective code provided by the authors. CelebA is a large-scale facial dataset consisting of 202,599 facial images of 10,177 celebrities captured in the wild. The new dataset, namely AUTH-OpenDR Augmented CelebA (AUTH-OpenDR ACelebA), was generated from 140,000 facial images corresponding to 9,161 persons, i.e. a subset of CelebA was used. For each CelebA image used, 13 synthetic images were generated by yaw-axis camera rotation in the interval [0°, +60°] with step +5°. Moreover, 10 synthetic images generated by pitch-axis camera rotation in the interval [0°, +45°] with step +5° were also created for each facial image of the aforementioned dataset. Since the CelebA license does not allow distribution of derivative work, we do not make ACelebA directly available but instead provide instructions and scripts on how to recreate it.
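For reference, the two rotation grids described above can be enumerated as follows (a sketch; the counts include the 0° view):

import numpy as np

yaw_angles = np.arange(0, 61, 5)    # 0°, 5°, ..., 60°  ->  13 yaw views per image
pitch_angles = np.arange(0, 46, 5)  # 0°, 5°, ..., 45°  ->  10 pitch views per image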
License: not specified (https://academictorrents.com/nolicensespecified)
In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and linking them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, by using all the possibly collected face images of each individual on the web as training data. The rich information provided by the knowledge base helps to conduct disambiguation and improve the recognition accuracy, and contributes to various real-world applications, such as image captioning and news video analysis. Associated with this task, we design and provide a concrete measurement set, evaluation protocol, as well as training data. We also present in detail our experiment setup and report promising baseline results. Our benchmark task could lead to one of the largest classification problems in computer vision. To the best of our knowledge, our training dataset, which contains 10M images in version 1, is the largest publicly available one in the world.
https://mmlab.ie.cuhk.edu.hk/projects/CelebA/CelebA_Dialog.html
CelebA-Dialog is a large-scale visual-language face dataset with the following features. Facial images are annotated with rich fine-grained labels, which classify one attribute into multiple degrees according to its semantic meaning. Accompanying each image, there are textual captions describing the attributes and a sample user editing request. CelebA-Dialog has:
- 10,177 identities,
- 202,599 face images,
- 5 fine-grained attribute annotations per image (Bangs, Eyeglasses, Beard, Smiling, and Age), and
- textual captions and a user editing request per image.
The Cross-Age Celebrity Dataset (CACD) contains 163,446 images from 2,000 celebrities collected from the Internet. The images were collected from search engines using celebrity name and year (2004-2013) as keywords. Therefore, it is possible to estimate the ages of the celebrities in the images by simply subtracting the birth year from the year in which the photo was taken.
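As a trivial illustration of that estimate (it may be off by one year depending on whether the birthday had passed when the photo was taken):

def estimate_age(photo_year: int, birth_year: int) -> int:
    # Approximate age of the celebrity at the time the photo was taken.
    return photo_year - birth_year

print(estimate_age(2013, 1985))  # 28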
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Overview: Welcome to the Celebrity Facial Transformations dataset, a meticulously curated collection of high-resolution (500 x 500 pixels) images showcasing famous personalities and celebrities with and without facial hair, specifically beards. Collected from Google Images, this dataset provides a fascinating glimpse into the ever-changing appearances of renowned figures.
Key Features:
- Celebrity Variety: Explore a wide range of celebrities from various fields, including actors, musicians, politicians, and athletes.
- Facial Transformation: Witness the striking contrast between their bearded and clean-shaven looks, allowing for in-depth analysis of how facial hair can alter one's appearance.
- High Resolution: All images are provided in a standardized high-resolution format (500 x 500 pixels), ensuring exceptional image quality for research and analysis.
Data Collection: The images were collected from Google Images using a systematic and careful approach to ensure image quality and relevance. Metadata, such as celebrity names and source URLs, is included for reference.
Data Format: All images are in a uniform high-resolution (500 x 500 pixels) JPEG format to maintain consistency and provide greater detail.
Disclaimer: This dataset is intended for research and educational purposes. Users are responsible for adhering to copyright and usage rights when using the images for any purpose beyond research.
Unlock new insights into celebrity transformations with this updated dataset, now featuring high-resolution images that provide even greater value for research and analysis.
DeepFake technology, born with the continuous maturation of deep learning techniques, primarily utilizes neural networks to create non-realistic faces. This method has enriched people's lives as computer vision advances and deep learning technologies mature. It has revolutionized the film industry by generating astonishing visuals and reducing production costs. Similarly, in the gaming industry, it has facilitated the creation of smooth and realistic animation effects. However, the malicious use of image manipulation to spread false information poses significant risks to society, casting doubt on the authenticity of digital content in visual media.

Forgery techniques encompass four main categories: face reenactment, face replacement, face editing, and face synthesis. Face editing, a commonly employed image manipulation method, involves falsifying facial features by modifying the information related to the five facial regions. As one of the commonly employed methods in facial editing, image inpainting technology utilizes known content from an image to fill in missing areas, aiming to restore the image in a way that aligns as closely as possible with human perception. In the context of facial forgery, image inpainting is primarily used for identity falsification, wherein facial features are altered to achieve the goal of replacing a face. The use of image inpainting for facial manipulation similarly introduces significant disruption to people's lives.

To support research on detection methods for such manipulations, this paper produced a large-scale dataset for face manipulation detection based on inpainting techniques. This paper specifically focuses on the field of image tampering detection, utilizing two classic datasets: the high-quality CelebA-HQ dataset, comprising 25,000 high-resolution (1,024 × 1,024 pixels) celebrity face images, and the low-quality FF++ dataset, consisting of 15,000 face images extracted from video frames. On the basis of the two datasets, facial feature regions (eyebrows, eyes, nose, mouth, and the entire facial area) are segmented using image segmentation methods. Corresponding mask images are created, and the segmented facial regions are directly obscured on the original image.

Two deep neural network-based inpainting methods (image inpainting via conditional texture and structure dual generation (CTSDG) and recurrent feature reasoning for image inpainting (RFR)) along with a traditional inpainting method (struct completion (SC)) were employed. The deep neural network methods require the provision of mask images to indicate the areas for inpainting, while the traditional method can directly perform inpainting on segmented facial feature images. The facial regions were inpainted using these three methods, resulting in a large-scale dataset comprising 600,000 images. This extensive dataset incorporates diverse pre-processing techniques and various inpainting methods, and includes images with different qualities and inpainted facial regions. It serves as a valuable resource for training and testing in related detection tasks, offering a rich dataset for subsequent research in the field, and also establishes a meaningful benchmark dataset for future studies in the domain of face tampering detection.
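A minimal sketch of the masking step described above, assuming a facial-feature region is given as a rectangular box (the dataset itself uses image segmentation rather than this hypothetical box input):

import numpy as np

def mask_and_obscure(image: np.ndarray, box: tuple):
    # image: H x W x 3 uint8 array; box: (y0, y1, x0, x1) region bounds.
    y0, y1, x0, x1 = box
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    mask[y0:y1, x0:x1] = 255        # white marks the area to be inpainted
    occluded = image.copy()
    occluded[y0:y1, x0:x1] = 0      # directly obscure the region on the original
    return mask, occluded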
CelebAGaze consists of 25,283 high-resolution celebrity images collected from CelebA and the Internet. It comprises 21,832 face images with eyes staring at the camera and 3,451 face images with eyes staring somewhere else. All images are cropped to 256 × 256, and the eye mask region is computed with dlib. Specifically, dlib is used to extract 68 facial landmarks and calculate the mean of the 6 points near each eye region, which becomes the center point of the mask. The size of the mask is fixed to 30×50. For the test set, 300 samples from domain Y and 100 samples from domain X are randomly selected, with the remaining images used as the training set. Note that this dataset is unpaired and is not labeled with the specific eye angle or head pose information.
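A minimal sketch of that mask-center computation with dlib (the predictor model path is an assumption, and the 6-point eye ranges follow the standard 68-landmark convention):

import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed model file

def eye_mask_centers(image: np.ndarray):
    # image: RGB uint8 array, e.g. a 256 x 256 CelebAGaze crop
    face = detector(image, 1)[0]
    shape = predictor(image, face)
    pts = np.array([[p.x, p.y] for p in shape.parts()])
    left_center = pts[36:42].mean(axis=0)   # mean of the 6 left-eye landmarks
    right_center = pts[42:48].mean(axis=0)  # mean of the 6 right-eye landmarks
    # A fixed 30 x 50 box around each center defines the eye mask region.
    return left_center, right_center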
This is the OSN-transmission CelebA sampling dataset from the paper "DF-RAP: A Robust Adversarial Perturbation for Defending against Deepfakes in Real-world Social Network Scenarios", collected by manual upload and download. The dataset includes 30,000 facial images of size 256×256 transmitted through online social networks (OSNs) and their corresponding original images. Facebook, Twitter, WeChat, and Weibo were selected as the transmission OSNs, with 7,500 images each.
Please see https://github.com/yakhyo/face-recognition to train Face Recognition model.
MS1M-ArcFace Dataset Description
The MS1M-ArcFace dataset is a cleaned and refined version of the original MS-Celeb-1M dataset, specifically curated for face recognition tasks. This dataset was processed to remove noisy and misaligned images, improving its quality and usability in training robust face recognition models.
Features
- Image size: 112x112 pixels
- Classes: 85,742
- Aligned: standardized facial landmarks
Info
- Dataset origin: based on the MS-Celeb-1M dataset, originally released by Microsoft Research.
- Purpose: designed to facilitate research and development in face recognition, particularly for high-accuracy models.
- Data: contains millions of images of celebrity faces, preprocessed and aligned for optimal model training.
- Preprocessing: cleaned and refined using advanced methods to reduce noise, mislabels, and inaccuracies.
- Applications: used in training state-of-the-art models like ArcFace for tasks such as identity verification, facial feature extraction, and more.
- License: users should verify compliance with ethical and licensing requirements before using or distributing the dataset.
- This dataset has been extensively used in academic and industrial research for benchmarking and developing cutting-edge face recognition systems.
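A minimal sketch of loading such data for training, assuming the images have been extracted into one directory per identity (the official release ships in a packed record format, so this layout is an assumption):

import tensorflow as tf

# Assumed layout: ms1m_arcface/<identity_id>/<image>.jpg
ds = tf.keras.utils.image_dataset_from_directory(
    "ms1m_arcface/",
    image_size=(112, 112),
    batch_size=128,
    label_mode="int",
)
print(len(ds.class_names))  # expected to match the 85,742 identities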
Please refer to the original source of this dataset for additional information. It's released here for academic purposes only.