8 datasets found
  1. Faces Dataset all at one place

    • kaggle.com
    Updated Feb 24, 2021
    Cite
    Shanmukh (2021). Faces Dataset all at one place [Dataset]. https://www.kaggle.com/datasets/shanmukh05/vggface-using-tripletloss/versions/18
    Dataset updated
    Feb 24, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shanmukh
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Context

    As the name suggests, this dataset gathers a variety of available face datasets for face detection and face recognition in one place.

    Content

    • CFP data folder: around 5,000 images distributed among 500 persons (10 each).
    • celebs folder: images of 100 Bollywood actors, 10,029 images in total.
    • images resolute folder: images of over 664 persons from across the world (approximately 1.3 GB).
    • dataset folder: low-resolution images of 158 persons.
    • crop faces folder: cropped faces from the dataset folder. Cropping was done with the MTCNN library.
    • vgg face weights h5 file: pretrained weights of the VGG Facenet model. For more details, visit VGG face recognition.
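
    The crop faces folder is described as the output of MTCNN cropping. As a rough sketch of that step, the helper below turns a detection box into crop coordinates that stay inside the image. It assumes the detector reports boxes as [x, y, width, height] and that coordinates can fall slightly outside the image; the function name, the margin parameter, and the sample detection are hypothetical and not taken from the dataset.

```python
from typing import List, Tuple


def clamp_box(box: List[int], img_w: int, img_h: int, margin: int = 0) -> Tuple[int, int, int, int]:
    """Convert an [x, y, width, height] box into (left, top, right, bottom),
    optionally padded by `margin` pixels and clamped to the image bounds."""
    x, y, w, h = box
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(img_w, x + w + margin)
    bottom = min(img_h, y + h + margin)
    return left, top, right, bottom


# Hypothetical detection (assumed layout); the negative x is deliberate,
# to show why clamping is needed before cropping.
detection = {"box": [-5, 20, 60, 80], "confidence": 0.99}
crop = clamp_box(detection["box"], img_w=100, img_h=90, margin=4)
print(crop)
```

    With Pillow, a (left, top, right, bottom) tuple like this can be passed to Image.crop to extract the face region.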

  2. Bollywood Celebrity Faces

    • kaggle.com
    Updated Mar 21, 2020
    Cite
    Rajesh Kumar (2020). Bollywood Celebrity Faces [Dataset]. https://www.kaggle.com/havingfun/100-bollywood-celebrity-faces/code
    Dataset updated
    Mar 21, 2020
    Dataset provided by
    Kaggle
    Authors
    Rajesh Kumar
    Description

    Context

    While learning deep learning techniques and trying existing models, I decided to build a "Which celeb do you look like" app based on feature extraction and matching. I used these images to train the model. All of the images were downloaded using bing image downloader.

    Content

    The list of the top 100 celebrities was fetched from https://www.bollywoodhungama.com/celebrities/top-100/

    List of the celebs: Aamir Khan, Abhay Deol, Abhishek Bachchan, Aftab Shivdasani, Aishwarya Rai, Ajay Devgn, Akshay Kumar, Akshaye Khanna, Alia Bhatt, Ameesha Patel, Amitabh Bachchan, Amrita Rao, Amy Jackson, Anil Kapoor, Anushka Sharma, Anushka Shetty, Arjun Kapoor, Arjun Rampal, Arshad Warsi, Asin, Ayushmann Khurrana, Bhumi Pednekar, Bipasha Basu, Bobby Deol, Deepika Padukone, Disha Patani, Emraan Hashmi, Esha Gupta, Farhan Akhtar, Govinda, Hrithik Roshan, Huma Qureshi, Ileana D’Cruz, Irrfan Khan, Jacqueline Fernandez, John Abraham, Juhi Chawla, Kajal Aggarwal, Kajol, Kangana Ranaut, Kareena Kapoor, Karisma Kapoor, Kartik Aaryan, Katrina Kaif, Kiara Advani, Kriti Kharbanda, Kriti Sanon, Kunal Khemu, Lara Dutta, Madhuri Dixit, Manoj Bajpayee, Mrunal Thakur, Nana Patekar, Nargis Fakhri, Naseeruddin Shah, Nushrat Bharucha, Paresh Rawal, Parineeti Chopra, Pooja Hegde, Prabhas, Prachi Desai, Preity Zinta, Priyanka Chopra, R Madhavan, Rajkummar Rao, Ranbir Kapoor, Randeep Hooda, Rani Mukerji, Ranveer Singh, Richa Chadda, Riteish Deshmukh, Saif Ali Khan, Salman Khan, Sanjay Dutt, Sara Ali Khan, Shah Rukh Khan, Shahid Kapoor, Shilpa Shetty, Shraddha Kapoor, Shreyas Talpade, Shruti Haasan, Sidharth Malhotra, Sonakshi Sinha, Sonam Kapoor, Suniel Shetty, Sunny Deol, Sushant Singh Rajput, Taapsee Pannu, Tabu, Tamannaah Bhatia, Tiger Shroff, Tusshar Kapoor, Uday Chopra, Vaani Kapoor, Varun Dhawan, Vicky Kaushal, Vidya Balan, Vivek Oberoi, Yami Gautam, Zareen Khan

    Acknowledgements

    Extractor used for fetching images: https://github.com/arthursdays/GoogleImagesDownloader
    Noisy images were manually removed from the search results to keep the best ones.
    I trained a face recognition system on this image set using the MTCNN (face detection) + facenet-PyTorch (feature extraction) architecture from ArsFutura: https://github.com/arsfutura/face-recognition
    Their blog: https://arsfutura.co/magazine/face-recognition-with-facenet-and-mtcnn/
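
    The description mentions feature extraction followed by matching for the "which celeb you look like" app. One common way to do the matching step is a nearest-neighbour lookup by cosine similarity over face embeddings. The sketch below assumes that approach; it is not taken from the author's code, and the tiny 3-dimensional vectors and celebrity_a/b/c labels are made-up placeholders standing in for real embeddings.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return the gallery label whose embedding is most similar to the query."""
    scored = ((name, cosine_similarity(query, emb)) for name, emb in gallery.items())
    return max(scored, key=lambda pair: pair[1])


# Placeholder embeddings (hypothetical); real ones would come from a face model.
gallery = {
    "celebrity_a": [0.9, 0.1, 0.0],
    "celebrity_b": [0.1, 0.9, 0.2],
    "celebrity_c": [0.0, 0.2, 0.9],
}
query = [0.2, 0.8, 0.1]

name, score = best_match(query, gallery)
print(name, round(score, 3))
```

    In practice each person usually has many images, so the gallery would hold either one averaged embedding per person or all embeddings with a vote over the nearest few.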

  3. face-celeb-vietnamese

    • huggingface.co
    Updated May 4, 2023
    Cite
    FPTU DSC (2023). face-celeb-vietnamese [Dataset]. https://huggingface.co/datasets/fptudsc/face-celeb-vietnamese
    Dataset updated
    May 4, 2023
    Dataset authored and provided by
    FPTU DSC
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Card for "face-celeb-vietnamese"

      Dataset Summary
    

    This dataset contains information on over 8,000 samples of well-known Vietnamese individuals, categorized into three professions: singers, actors, and beauty queens. The dataset includes data on more than 100 celebrities in each of the three job categories.

      Languages
    

    Vietnamese: The label is used to indicate the name of celebrities in Vietnamese.

      Dataset Structure
    

    The image and Vietnamese… See the full description on the dataset page: https://huggingface.co/datasets/fptudsc/face-celeb-vietnamese.

  4. celeb_a

    • tensorflow.org
    • datasetninja.com
    • +3 more
    Updated Jun 1, 2024
    + more versions
    Cite
    (2024). celeb_a [Dataset]. https://www.tensorflow.org/datasets/catalog/celeb_a
    Dataset updated
    Jun 1, 2024
    Description

    CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA offers great diversity, large quantities, and rich annotations, including:

    • 10,177 identities,
    • 202,599 face images, and
    • 5 landmark locations and 40 binary attribute annotations per image.

    The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.

    Note: CelebA dataset may contain potential bias. The fairness indicators example goes into detail about several considerations to keep in mind while using the CelebA dataset.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('celeb_a', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/celeb_a-2.1.0.png

  5. tmdb-celeb-10k

    • huggingface.co
    Cite
    Ismail Ashraq, tmdb-celeb-10k [Dataset]. https://huggingface.co/datasets/ashraq/tmdb-celeb-10k
    Authors
    Ismail Ashraq
    Description

    ashraq/tmdb-celeb-10k dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. celebrity_dataset

    • huggingface.co
    Updated Feb 8, 2024
    + more versions
    Cite
    Hideki Okamura (2024). celebrity_dataset [Dataset]. https://huggingface.co/datasets/ares1123/celebrity_dataset
    Dataset updated
    Feb 8, 2024
    Authors
    Hideki Okamura
    Description

    Celebrity 1000

    Top 1000 celebrities. 18,184 images. 256x256. Square cropped to face.

  7. Supporting datasets PubFig05 for: "Heterogeneous Ensemble Combination Search...

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    application/gzip, bin
    Updated Jan 24, 2020
    Cite
    Mohammad Nazmul Haque; Nasimul Noman; Regina Berratta; Pablo Moscato (2020). Supporting datasets PubFig05 for: "Heterogeneous Ensemble Combination Search using Genetic Algorithm for Class Imbalanced Data Classification" [Dataset]. http://doi.org/10.5281/zenodo.33539
    Available download formats: application/gzip, bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mohammad Nazmul Haque; Nasimul Noman; Regina Berratta; Pablo Moscato
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Faces Dataset: PubFig05

    This is a subset of the "PubFig83" dataset [1] that provides 100 images each of the 5 celebrities that are most difficult to recognise (each celebrity is a class in the classification problem). For each celebrity, we took 100 images and separated them into training and testing sets of 90 and 10 images, respectively:

    Persons: Jennifer Lopez; Katherine Heigl; Scarlett Johansson; Mariah Carey; Jessica Alba

    Feature Extraction

    To extract features from images, we have applied the HT-L3-model as described in [2] and obtained 25600 features.

    Feature Selection

    The feature selection steps are, in brief, as follows:

    1. Entropy Filtering: First, we applied an implementation of Fayyad and Irani's [3] entropy-based heuristic to discretise the dataset and discarded features using the minimum description length (MDL) principle. Only 4878 features passed this entropy-based filter.

    2. Class-Distribution Balancing: Next, we converted the dataset into 5 binary-class datasets using a one-vs-all setup. As a result, each of these datasets was imbalanced at a ratio of 1:4. We then converted them into balanced binary-class datasets using random sub-sampling. Further processing of the dataset is described in the paper.

    3. (alpha,beta)-k Feature Selection: To obtain a good feature set for training the classifier, we selected features using an approach based on the (alpha,beta)-k feature selection problem [4]. It selects a minimum subset of features that maximises both within-class similarity and between-class dissimilarity. We applied the entropy filtering and (alpha,beta)-k feature subset selection methods in three ways and obtained different numbers of features (in the Table below) after consolidating them into binary-class datasets.

    • UAB: We applied the (alpha,beta)-k feature set method to each of the balanced binary-class datasets and took the union of the features selected for each binary-class dataset. Finally, we applied the (alpha,beta)-k feature set selection method to each of the binary-class datasets to obtain a set of features.

    • IAB: We applied the (alpha,beta)-k feature set method to each of the balanced binary-class datasets and took the intersection of the features selected for each binary-class dataset. Finally, we applied the (alpha,beta)-k feature set selection method to each of the binary-class datasets to obtain a set of features.

    • UEAB: We applied the (alpha,beta)-k feature set method to each of the balanced binary-class datasets. Then, we applied the entropy filtering and (alpha,beta)-k feature set selection method to each of the balanced binary-class datasets. Finally, we took the union of the features selected for each balanced binary-class dataset to obtain a set of features.
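
    To make the entropy filtering in step 1 more concrete, here is a minimal sketch of the Fayyad and Irani MDL acceptance test for a single feature: find the cut point with the highest information gain and keep the feature only if that gain exceeds the MDL threshold. This is an illustration of the published criterion, not the authors' implementation; the function names and toy data are hypothetical, and the recursive splitting that follows an accepted cut is omitted because only the keep-or-discard decision is shown.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def feature_passes_mdlp(values: Sequence[float], labels: Sequence[int]) -> bool:
    """True if the best binary cut on this feature satisfies the MDL criterion."""
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    n = len(ys)
    ent_s = entropy(ys)
    k = len(set(ys))

    best = None  # (gain, entropy_left, entropy_right, classes_left, classes_right)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut between identical feature values
        left, right = ys[:i], ys[i:]
        ent_l, ent_r = entropy(left), entropy(right)
        gain = ent_s - (len(left) / n) * ent_l - (len(right) / n) * ent_r
        if best is None or gain > best[0]:
            best = (gain, ent_l, ent_r, len(set(left)), len(set(right)))

    if best is None:
        return False  # constant feature: nothing to cut
    gain, ent_l, ent_r, k1, k2 = best
    delta = math.log2(3 ** k - 2) - (k * ent_s - k1 * ent_l - k2 * ent_r)
    return gain > (math.log2(n - 1) + delta) / n


# Toy data (hypothetical): 20 samples, two classes.
values = [float(v) for v in range(20)]
informative_labels = [0] * 10 + [1] * 10   # class changes once along the feature
noisy_labels = [0, 1] * 10                 # class alternates, feature is useless

print(feature_passes_mdlp(values, informative_labels))
print(feature_passes_mdlp(values, noisy_labels))
```

    A feature for which no cut is accepted ends up with a single interval after discretisation, which is why it carries no class information and can be discarded.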

    All of these datasets are inside the compressed folder, which also contains a document describing the process in detail.
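
    The class-distribution balancing in step 2 and the union/intersection consolidation used by UAB and IAB can be sketched as follows. The sketch assumes 5 classes with 100 samples each, matching the description; the class names A to E and the selected feature indices are made-up placeholders, and the (alpha,beta)-k selection itself is not implemented here.

```python
import random
from typing import Dict, List, Sequence, Set, Tuple


def one_vs_all_balanced(labels: Sequence[str], positive: str,
                        rng: random.Random) -> Tuple[List[int], List[int]]:
    """Return (positive indices, sampled negative indices) of equal size.

    One-vs-all turns `positive` into class 1 and everything else into class 0;
    random sub-sampling of the negatives then removes the imbalance."""
    pos = [i for i, y in enumerate(labels) if y == positive]
    neg = [i for i, y in enumerate(labels) if y != positive]
    return pos, rng.sample(neg, len(pos))


classes = ["A", "B", "C", "D", "E"]               # placeholder class names
labels = [c for c in classes for _ in range(100)]  # 100 samples per class
rng = random.Random(0)                             # fixed seed for repeatability

balanced = {c: one_vs_all_balanced(labels, c, rng) for c in classes}
pos_a, neg_a = balanced["A"]
print(len(pos_a), len(neg_a))  # 100 positives vs 100 sampled negatives (400 before sampling)

# Hypothetical feature indices selected on each balanced binary dataset.
selected: Dict[str, Set[int]] = {
    "A": {1, 4, 7},
    "B": {4, 7, 9},
    "C": {2, 4, 7},
    "D": {4, 7, 8},
    "E": {3, 4, 7},
}
union_features = set().union(*selected.values())               # UAB-style consolidation
intersection_features = set.intersection(*selected.values())   # IAB-style consolidation
print(sorted(union_features), sorted(intersection_features))
```

    The union keeps every feature that helps at least one binary problem, while the intersection keeps only features useful for all of them, so the intersection is the smaller and stricter set.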

    References

    [1] Pinto, N., Stone, Z., Zickler, T., & Cox, D. (2011). Scaling up biologically-inspired computer vision: A case study in unconstrained face recognition on facebook. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on (pp. 35–42).

    [2] Cox, D., & Pinto, N. (2011). Beyond simple features: A large-scale feature search approach to unconstrained face recognition. In Automatic Face Gesture Recognition and Workshops (FG 2011), 2011 IEEE International Conference on (pp. 8–15).

    [3] Fayyad, U. M., & Irani, K. B. (1993). Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In International Joint Conference on Artificial Intelligence (pp. 1022–1029).

    [4] Berretta, R., Mendes, A., & Moscato, P. (2005). Integer programming models and algorithms for molecular classification of cancer from microarray data. In Proceedings of the Twenty-eighth Australasian conference on Computer Science - Volume 38 (pp. 361–370). 1082201: Australian Computer Society, Inc.

  8. Leading celebrities India 2023, by brand value

    • statista.com
    Updated Jul 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Leading celebrities India 2023, by brand value [Dataset]. https://www.statista.com/statistics/1359798/india-celebrity-ranking-by-brand-value/
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    India
    Description

    Indian cricketer Virat Kohli emerged as the leading celebrity with the highest brand value across India, grossing over *** million U.S. dollars in 2023. Bollywood actor Ranveer Singh was the second most-valued celebrity in the South Asian country, with a brand value of about *** million U.S. dollars that year. Celebrity endorsers are a key element of marketing campaigns among companies seeking mass brand awareness in India.

    Celebrity endorsement

    Featuring celebrities in advertising has long been a powerful marketing strategy in India, where fans often idolize their favorite sports and on-screen stars. This strong emotional connection is leveraged by brands to enhance their market presence. In 2023, the leading 10 celebrities collectively endorsed over *** products, reflecting a growth from the previous year. These high-profile endorsements boost consumer trust and significantly drive brand visibility in an increasingly competitive marketplace. Brands also strategically partner with celebrities to reach their target audience, with emerging Gen Z icons like Sara Ali Khan and Shubman Gill gaining popularity as brand endorsers.

    Influence of Indian celebrities

    Beyond their professional careers, top celebrities in India play a critical role in influencing consumer choices through brand endorsements. Their immense popularity and reach, particularly on social media platforms, provide brands with direct access to vast and diverse audiences. A 2022 survey revealed that ** percent of Indian consumers were more likely to try a product recommended by their favorite influencer, highlighting the role celebrities play in shaping purchase decisions through their endorsements.

