8 datasets found

Faces Dataset all at one place
kaggle.com
Updated Feb 24, 2021
Cite
Shanmukh (2021). Faces Dataset all at one place [Dataset]. https://www.kaggle.com/datasets/shanmukh05/vggface-using-tripletloss/versions/18
Dataset updated
Feb 24, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shanmukh
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Context

As the name suggests, this dataset brings together a variety of available face datasets.

Content

CFP data folder: Around 5000 images distributed among 500 persons (10 each).

celebs folder: Images of 100 Bollywood actors. A total of 10029 images are present.

images resolute: Images of over 664 persons from across the world (approximately 1.3 GB).

dataset folder: Low-resolution images of 158 persons.

crop faces folder: Cropped faces from the dataset folder. Cropping is done with the MTCNN library.

vgg face weights h5 file: Pretrained weights of the VGG Facenet model. For more details, see VGG face recognition.
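A quick way to check counts such as "10 images each" is to tally image files per person. The sketch below assumes one sub-folder per person under a root directory; that layout, the function name `images_per_identity`, and the accepted file suffixes are assumptions for illustration, not something the dataset page confirms.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; adjust to whatever the folders actually contain.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_per_identity(root):
    """Count image files in each immediate sub-folder of `root`.

    Assumes (hypothetically) that every sub-folder holds one person's images.
    """
    counts = Counter()
    for person_dir in sorted(Path(root).iterdir()):
        if not person_dir.is_dir():
            continue
        counts[person_dir.name] = sum(
            1
            for f in person_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `images_per_identity("path/to/folder")` returns a `Counter` keyed by folder name, so `len(counts)` gives the number of persons and `sum(counts.values())` the number of images.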
Bollywood Celebrity Faces
kaggle.com
Updated Mar 21, 2020
Cite
Rajesh Kumar (2020). Bollywood Celebrity Faces [Dataset]. https://www.kaggle.com/havingfun/100-bollywood-celebrity-faces/code
Dataset updated
Mar 21, 2020
Dataset provided by
Kaggle
Authors
Rajesh Kumar
Description
Context

While learning different deep learning techniques and trying different existing models, I thought of creating a "Which celeb do you look like" app using feature extraction and matching. I used these images to train the model. All these images were downloaded using bing image downloader.

Content

The list of the top 100 celebrities was fetched from https://www.bollywoodhungama.com/celebrities/top-100/

List of the celebs: Aamir Khan, Abhay Deol, Abhishek Bachchan, Aftab Shivdasani, Aishwarya Rai, Ajay Devgn, Akshay Kumar, Akshaye Khanna, Alia Bhatt, Ameesha Patel, Amitabh Bachchan, Amrita Rao, Amy Jackson, Anil Kapoor, Anushka Sharma, Anushka Shetty, Arjun Kapoor, Arjun Rampal, Arshad Warsi, Asin, Ayushmann Khurrana, Bhumi Pednekar, Bipasha Basu, Bobby Deol, Deepika Padukone, Disha Patani, Emraan Hashmi, Esha Gupta, Farhan Akhtar, Govinda, Hrithik Roshan, Huma Qureshi, Ileana D’Cruz, Irrfan Khan, Jacqueline Fernandez, John Abraham, Juhi Chawla, Kajal Aggarwal, Kajol, Kangana Ranaut, Kareena Kapoor, Karisma Kapoor, Kartik Aaryan, Katrina Kaif, Kiara Advani, Kriti Kharbanda, Kriti Sanon, Kunal Khemu, Lara Dutta, Madhuri Dixit, Manoj Bajpayee, Mrunal Thakur, Nana Patekar, Nargis Fakhri, Naseeruddin Shah, Nushrat Bharucha, Paresh Rawal, Parineeti Chopra, Pooja Hegde, Prabhas, Prachi Desai, Preity Zinta, Priyanka Chopra, R Madhavan, Rajkummar Rao, Ranbir Kapoor, Randeep Hooda, Rani Mukerji, Ranveer Singh, Richa Chadda, Riteish Deshmukh, Saif Ali Khan, Salman Khan, Sanjay Dutt, Sara Ali Khan, Shah Rukh Khan, Shahid Kapoor, Shilpa Shetty, Shraddha Kapoor, Shreyas Talpade, Shruti Haasan, Sidharth Malhotra, Sonakshi Sinha, Sonam Kapoor, Suniel Shetty, Sunny Deol, Sushant Singh Rajput, Taapsee Pannu, Tabu, Tamannaah Bhatia, Tiger Shroff, Tusshar Kapoor, Uday Chopra, Vaani Kapoor, Varun Dhawan, Vicky Kaushal, Vidya Balan, Vivek Oberoi, Yami Gautam, Zareen Khan

Acknowledgements

Extractor used for fetching images: https://github.com/arthursdays/GoogleImagesDownloader

Noisy images were manually removed from the search results to keep the best images.

I trained a face recognition system on this image set using the MTCNN (face detection) + facenet-PyTorch (feature extraction) architecture from ArsFutura: https://github.com/arsfutura/face-recognition

Their blog: https://arsfutura.co/magazine/face-recognition-with-facenet-and-mtcnn/
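The "feature extraction and matching" idea can be illustrated with a small nearest-embedding lookup. This is only a sketch under stated assumptions: it takes embeddings as already computed (for example by a face-embedding network), uses cosine similarity as the matching score, and the names `cosine_similarity`, `best_match`, `person_1` and `person_2` are hypothetical. The author's actual matching method is not described on the page.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_match(query, gallery):
    """Return (name, similarity) for the gallery embedding closest to `query`.

    `gallery` maps a name to its embedding vector (assumed precomputed).
    """
    return max(
        ((name, cosine_similarity(query, emb)) for name, emb in gallery.items()),
        key=lambda pair: pair[1],
    )


# Toy 3-dimensional embeddings; real face embeddings are much longer.
gallery = {"person_1": [1.0, 0.0, 0.0], "person_2": [0.0, 1.0, 0.0]}
name, score = best_match([0.9, 0.1, 0.0], gallery)
print(name, round(score, 3))
```

In practice the gallery would hold one embedding (or an average of several) per celebrity, and the query would be the embedding of the user's cropped face.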
face-celeb-vietnamese
huggingface.co
Updated May 4, 2023
Cite
FPTU DSC (2023). face-celeb-vietnamese [Dataset]. https://huggingface.co/datasets/fptudsc/face-celeb-vietnamese
Dataset updated
May 4, 2023
Dataset authored and provided by
FPTU DSC
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for "face-celeb-vietnamese"

Dataset Summary

This dataset contains information on over 8,000 samples of well-known Vietnamese individuals, categorized into three professions: singers, actors, and beauty queens. The dataset includes data on more than 100 celebrities in each of the three job categories.

Languages

Vietnamese: the labels give the names of the celebrities in Vietnamese.

Dataset Structure

The image and Vietnamese… See the full description on the dataset page: https://huggingface.co/datasets/fptudsc/face-celeb-vietnamese.
celeb_a
tensorflow.org
datasetninja.com
Updated Jun 1, 2024
Cite
(2024). celeb_a [Dataset]. https://www.tensorflow.org/datasets/catalog/celeb_a
Dataset updated
Jun 1, 2024
Description
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:

- 10,177 identities,
- 202,599 face images, and
- 5 landmark locations and 40 binary attribute annotations per image.

The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.

Note: CelebA dataset may contain potential bias. The fairness indicators example goes into detail about several considerations to keep in mind while using the CelebA dataset.

To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('celeb_a', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/celeb_a-2.1.0.png
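Because the description warns about potential bias, one simple first check is how often each binary attribute is set. The sketch below works on plain Python dictionaries shaped like the TFDS examples, with an `attributes` mapping from attribute name to a boolean. It assumes the examples have already been converted to Python or NumPy values (for instance with `tfds.as_numpy`); the function name `attribute_rates` and the two-example toy input are illustrative only.

```python
from collections import Counter


def attribute_rates(examples):
    """Fraction of examples in which each binary attribute is true."""
    counts = Counter()
    names = set()
    total = 0
    for ex in examples:
        total += 1
        for name, value in ex["attributes"].items():
            names.add(name)
            counts[name] += bool(value)  # True counts as 1, False as 0
    return {name: counts[name] / total for name in sorted(names)}


# Toy input with two of the 40 attribute names; values are made up.
toy_examples = [
    {"attributes": {"Smiling": True, "Male": False}},
    {"attributes": {"Smiling": True, "Male": True}},
]
rates = attribute_rates(toy_examples)
print(rates)
```

Attributes with rates far from 0.5, or rates that differ sharply between subgroups, are the ones worth examining before training on them.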
tmdb-celeb-10k
huggingface.co
Cite
Ismail Ashraq, tmdb-celeb-10k [Dataset]. https://huggingface.co/datasets/ashraq/tmdb-celeb-10k
Authors
Ismail Ashraq
Description
ashraq/tmdb-celeb-10k dataset hosted on Hugging Face and contributed by the HF Datasets community
celebrity_dataset
huggingface.co
Updated Feb 8, 2024
Cite
Hideki Okamura (2024). celebrity_dataset [Dataset]. https://huggingface.co/datasets/ares1123/celebrity_dataset
Dataset updated
Feb 8, 2024
Authors
Hideki Okamura
Description
Celebrity 1000

Top 1000 celebrities. 18,184 images. 256x256. Square cropped to face.
Supporting datasets PubFig05 for: "Heterogeneous Ensemble Combination Search...
zenodo.org
explore.openaire.eu
application/gzip, bin
Updated Jan 24, 2020
Cite
Mohammad Nazmul Haque; Nasimul Noman; Regina Berratta; Pablo Moscato (2020). Supporting datasets PubFig05 for: "Heterogeneous Ensemble Combination Search using Genetic Algorithm for Class Imbalanced Data Classification" [Dataset]. http://doi.org/10.5281/zenodo.33539
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.33539
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mohammad Nazmul Haque; Nasimul Noman; Regina Berratta; Pablo Moscato
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Faces Dataset: PubFig05

This is a subset of the "PubFig83" dataset [1] that provides 100 images each of the 5 celebrities that are most difficult to recognise (each celebrity is a class in the classification problem). For each celebrity, we took 100 images and separated them into training and testing sets of 90 and 10 images, respectively:

Persons: Jennifer Lopez; Katherine Heigl; Scarlett Johansson; Mariah Carey; Jessica Alba
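A per-class 90/10 split like the one described can be sketched as follows. The description does not say how the 10 test images were chosen, so the seeded random shuffle here is an assumption, and `split_per_class` is a hypothetical helper name.

```python
import random


def split_per_class(items_by_class, n_train=90, seed=0):
    """Split each class's items into a training and a testing list.

    The first `n_train` items after a seeded shuffle go to training and
    the remainder to testing (random selection is an assumption here).
    """
    rng = random.Random(seed)
    train, test = {}, {}
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        train[label] = shuffled[:n_train]
        test[label] = shuffled[n_train:]
    return train, test


# Placeholder data: 5 classes with 100 image identifiers each.
images = {f"class_{c}": [f"class_{c}_img_{i}" for i in range(100)] for c in range(5)}
train, test = split_per_class(images)
```

With 100 images per class and `n_train=90`, every class ends up with 90 training and 10 testing images, and no image appears in both sets.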

Feature Extraction

To extract features from images, we have applied the HT-L3-model as described in [2] and obtained 25600 features.

Feature Selection

The feature selection procedure is, in brief, as follows:

Entropy Filtering: First, we applied an implementation of Fayyad and Irani's [3] entropy-based heuristic to discretise the dataset and discarded features using the minimum description length (MDL) principle. Only 4878 features passed this entropy-based filtering.
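For readers unfamiliar with the MDL stopping rule, the sketch below tests whether a single numeric feature would survive such a filter. It follows the published Fayyad and Irani criterion as commonly stated: take the highest-gain binary cut and accept it only if the information gain exceeds (log2(N - 1) + Delta) / N. A feature whose best cut is rejected gets no cut points and is discarded. This is an illustration of that criterion, not the implementation the authors used, and `passes_mdl_filter` is a hypothetical name.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def passes_mdl_filter(values, labels):
    """True if the best binary cut on this feature meets the MDL criterion."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    ys = [y for _, y in pairs]
    n = len(ys)
    ent_s, k = entropy(ys), len(set(ys))
    best = None  # (gain, delta) of the highest-gain candidate cut
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal feature values
        left, right = ys[:i], ys[i:]
        ent_l, ent_r = entropy(left), entropy(right)
        gain = ent_s - (len(left) * ent_l + len(right) * ent_r) / n
        delta = math.log2(3 ** k - 2) - (
            k * ent_s - len(set(left)) * ent_l - len(set(right)) * ent_r
        )
        if best is None or gain > best[0]:
            best = (gain, delta)
    if best is None:
        return False  # constant feature: no candidate cut exists
    gain, delta = best
    return gain > (math.log2(n - 1) + delta) / n
```

A feature that cleanly separates the classes passes, while one whose values carry no class information is rejected; applying the check to every column yields the kind of reduced feature set the filtering step describes.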

Class-Distribution Balancing: Next, we converted the dataset into a binary-class problem by separating it into 5 binary-class datasets using a one-vs-all setup. As a result, these datasets became imbalanced at a ratio of 1:4. We then converted them into balanced binary-class datasets using random sub-sampling. Further processing of the dataset is described in the paper.
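The one-vs-all conversion followed by random sub-sampling can be sketched like this. It assumes that balancing means randomly keeping as many negative samples as there are positives, which is the usual reading of "random sub-sampling" but is not spelled out in the description; `balanced_one_vs_all` is a hypothetical helper name.

```python
import random


def balanced_one_vs_all(samples, labels, positive, seed=0):
    """Build a balanced binary dataset for one class against all others.

    Samples of `positive` get label 1. An equally sized random subset of
    the remaining samples gets label 0 (assumed balancing strategy).
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == positive]
    neg = [s for s, y in zip(samples, labels) if y != positive]
    neg = rng.sample(neg, min(len(pos), len(neg)))
    data = [(s, 1) for s in pos] + [(s, 0) for s in neg]
    rng.shuffle(data)
    return data


# Placeholder data: 5 classes with 90 training samples each (ratio 1:4).
labels = [c for c in range(5) for _ in range(90)]
samples = list(range(len(labels)))
binary = balanced_one_vs_all(samples, labels, positive=0)
```

Running this once per class gives the 5 balanced binary-class datasets; with 90 positives and 360 negatives, each balanced dataset holds 90 samples of each label.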

(alpha,beta)-k Feature selection: To get a good feature set for training the classifier, we selected features using an approach based on the (alpha,beta)-k feature selection problem [4]. It selects a minimum subset of features that maximises both the similarity within each class and the dissimilarity between different classes. We applied the entropy filtering and (alpha,beta)-k feature subset selection methods in three ways and obtained different numbers of features (in the Table below) after consolidating them into binary-class datasets.

UAB: We applied the (alpha,beta)-k feature set method to each of the balanced binary-class datasets and took the union of the features selected across these datasets. Finally, we applied the (alpha,beta)-k feature set selection method to each of the binary-class datasets to obtain a set of features.

IAB: We applied the (alpha,beta)-k feature set method to each of the balanced binary-class datasets and took the intersection of the features selected across these datasets. Finally, we applied the (alpha,beta)-k feature set selection method to each of the binary-class datasets to obtain a set of features.

UEAB: We applied the (alpha,beta)-k feature set method to each of the balanced binary-class datasets. Then, we applied the entropy filtering and the (alpha,beta)-k feature set selection method to each of the balanced binary-class datasets. Finally, we took the union of the features selected across the balanced binary-class datasets to obtain a set of features.
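The union and intersection steps in the variants above amount to combining the per-dataset feature selections as sets. The sketch below shows only that consolidation step; the (alpha,beta)-k selection itself is a separate optimisation problem and is not implemented here. The function name `consolidate` and the feature indices are made up for illustration.

```python
def consolidate(selected_per_dataset, mode="union"):
    """Combine the feature sets selected on each binary-class dataset.

    mode="union" keeps features chosen on any dataset.
    mode="intersection" keeps features chosen on every dataset.
    """
    sets = [set(s) for s in selected_per_dataset]
    if not sets:
        return set()
    if mode == "union":
        return set().union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical feature indices selected on three binary-class datasets.
selections = [{1, 4, 7}, {4, 7, 9}, {2, 4, 7}]
union_features = consolidate(selections, mode="union")
common_features = consolidate(selections, mode="intersection")
```

The union grows with every additional dataset while the intersection can only shrink, which is one reason the three variants end up with different numbers of features.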

All of these datasets are inside the compressed folder, which also contains a document describing the process in detail.

References

[1] Pinto, N., Stone, Z., Zickler, T., & Cox, D. (2011). Scaling up biologically-inspired computer vision: A case study in unconstrained face recognition on facebook. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on (pp. 35–42).

[2] Cox, D., & Pinto, N. (2011). Beyond simple features: A large-scale feature search approach to unconstrained face recognition. In Automatic Face Gesture Recognition and Workshops (FG 2011), 2011 IEEE International Conference on (pp. 8–15).

[3] Fayyad, U. M., & Irani, K. B. (1993). Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In International Joint Conference on Artificial Intelligence (pp. 1022–1029).

[4] Berretta, R., Mendes, A., & Moscato, P. (2005). Integer programming models and algorithms for molecular classification of cancer from microarray data. In Proceedings of the Twenty-eighth Australasian conference on Computer Science - Volume 38 (pp. 361–370). Australian Computer Society, Inc.
Leading celebrities India 2023, by brand value
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Leading celebrities India 2023, by brand value [Dataset]. https://www.statista.com/statistics/1359798/india-celebrity-ranking-by-brand-value/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
India
Description
Indian cricketer Virat Kohli emerged as the leading celebrity with the highest brand value across India, grossing over *** million U.S. dollars in 2023. Bollywood actor Ranveer Singh was the second most-valued celebrity in the South Asian country, with a brand value of about *** million U.S. dollars that year. Celebrity endorsers are a key element of marketing campaigns among companies seeking mass brand awareness in India.

Celebrity endorsement

Featuring celebrities in advertising has long been a powerful marketing strategy in India, where fans often idolize their favorite sports and on-screen stars. This strong emotional connection is leveraged by brands to enhance their market presence. In 2023, the leading 10 celebrities collectively endorsed over *** products, reflecting a growth from the previous year. These high-profile endorsements boost consumer trust and significantly drive brand visibility in an increasingly competitive marketplace. Brands also strategically partner with celebrities to reach their target audience, with emerging Gen Z icons like Sara Ali Khan and Shubman Gill gaining popularity as brand endorsers.

Influence of Indian celebrities

Beyond their professional careers, top celebrities in India play a critical role in influencing consumer choices through brand endorsements. Their immense popularity and reach, particularly on social media platforms, provide brands with direct access to vast and diverse audiences. A 2022 survey revealed that ** percent of Indian consumers were more likely to try a product recommended by their favorite influencer, highlighting the role celebrities play in shaping purchase decisions through their endorsements.