Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The FaciaVox dataset is an extensive multimodal biometric resource designed to enable in-depth exploration of face-image and voice recording research areas in both masked and unmasked scenarios.
Features of the Dataset:
1. Multimodal Data: A total of 1,800 face images (JPG) and 6,000 audio recordings (WAV) were collected, enabling cross-domain analysis of visual and auditory biometrics.
2. Participants were categorized into four age groups for structured labeling:
Label 1: Under 16 years
Label 2: 16 to less than 31 years
Label 3: 31 to less than 46 years
Label 4: 46 years and above
3. Sibling Data: Some participants are siblings, adding a challenging layer for speaker identification and facial recognition tasks due to genetic similarities in vocal and facial features. Sibling relationships are documented in the accompanying "FaciaVox List" data file.
4. Standardized Filenames: The dataset uses a consistent, intuitive naming convention for both facial images and voice recordings (a filename-parsing sketch follows this list). Each filename includes:
Type (F: Face Image, V: Voice Recording)
Participant ID (e.g., sub001)
Mask Type (e.g., a: unmasked, b: disposable mask, etc.)
Zoom Level or Sentence ID (e.g., 1x, 3x, 5x for images or specific sentence identifier {01, 02, 03, ..., 10} for recordings)
5. Diverse Demographics: Participants from 19 different countries.
6. Challenging Conditions: Face images captured with reflective mask shields and under severe lighting conditions pose a difficult face recognition problem.
7. Multilingual Speech: Each participant uttered 7 English statements and 3 Arabic statements, regardless of their native language, adding a challenge for speaker identification.
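As a quick illustration of the naming convention in item 4, the short Python sketch below splits a filename into its labeled fields. The underscore-separated layout and the example names are assumptions for illustration; the authoritative field layout is the one documented in the FaciaVox List file.

```python
import re

# Hypothetical filename layout inferred from the convention described above,
# e.g. "F_sub001_a_3x.jpg" or "V_sub014_b_07.wav". Check the actual separator
# and field order against the FaciaVox List file.
PATTERN = re.compile(
    r"(?P<type>[FV])_(?P<subject>sub\d{3})_(?P<mask>[a-z])_(?P<variant>\d+x|\d{2})\.(jpg|wav)$",
    re.IGNORECASE,
)

def parse_faciavox_name(filename: str) -> dict:
    """Split a FaciaVox filename into its labeled fields."""
    m = PATTERN.match(filename)
    if m is None:
        raise ValueError(f"unrecognized FaciaVox filename: {filename}")
    fields = m.groupdict()
    fields["modality"] = "face image" if fields["type"].upper() == "F" else "voice recording"
    return fields

print(parse_faciavox_name("F_sub001_a_3x.jpg"))   # zoom-level variant
print(parse_faciavox_name("V_sub014_b_07.wav"))   # sentence-ID variant
```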
Research Applications
FaciaVox is a versatile dataset supporting a wide range of research domains, including but not limited to:
• Speaker Identification (SI) and Face Recognition (FR): Evaluating biometric systems under varying conditions.
• Impact of Masks on Biometrics: Investigating how different facial coverings affect recognition performance.
• Language Impact on SI: Exploring the effects of native and non-native speech on speaker identification.
• Age and Gender Estimation: Inferring demographic information from voice and facial features.
• Race and Ethnicity Matching: Studying biometrics across diverse populations.
• Synthetic Voice and Deepfake Detection: Detecting cloned or generated speech.
• Cross-Domain Biometric Fusion: Combining facial and vocal data for robust authentication.
• Speech Intelligibility: Assessing how masks influence speech clarity.
• Image Inpainting: Reconstructing occluded facial regions for improved recognition.
Researchers can use the facial images and voice recordings independently or in combination to explore multimodal biometric systems. The standardized filenames and accompanying metadata make it easy to align visual and auditory data for cross-domain analyses. Sibling relationships and demographic labels add depth for tasks such as familial voice recognition, demographic profiling, and model bias evaluation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Heartprint is a large biometric database of multisession ECG signals comprising 1,539 records of 15 seconds each, collected from 199 subjects. The signals were collected in multiple sessions over ten years, starting in 2012, with subjects in resting and reading conditions, and are organized into a multisession database. The dataset also covers several demographic classes such as gender, ethnicity, and age group. The Heartprint dataset can be a valuable resource for the development and evaluation of biometric recognition algorithms.
Please cite the following article if you use this dataset: Islam, M.S.; Alhichri, H.; Bazi, Y.; Ammour, N.; Alajlan, N.; Jomaa, R.M. Heartprint: A Dataset of Multisession ECG Signal with Long Interval Captured from Fingers for Biometric Recognition. Data 2022, 7, 141. https://doi.org/10.3390/data7100141
https://dataintelo.com/privacy-and-policy
The global gait biometrics market size was valued at USD 0.9 billion in 2023 and is expected to reach approximately USD 2.5 billion by 2032, growing at a compound annual growth rate (CAGR) of 11.5% during the forecast period. This impressive growth is driven by advancements in sensor technology, increasing adoption in healthcare and security applications, and rising awareness of the benefits of gait biometrics in various sectors.
One of the primary growth factors of the gait biometrics market is the increasing prevalence of chronic diseases and conditions that impair mobility, such as Parkinson’s disease, arthritis, and stroke. These conditions necessitate advanced diagnostic and monitoring tools, propelling the demand for gait biometrics in healthcare settings. Healthcare professionals are increasingly utilizing gait analysis for early diagnosis, rehabilitation, and treatment effectiveness, which is significantly boosting market growth.
Technological advancements in the development of sophisticated sensors and machine learning algorithms are also major drivers of the gait biometrics market. Enhanced accuracy and reliability of the data collected through wearable and non-wearable sensors have broadened the scope of gait biometrics applications. Integration with artificial intelligence allows for more precise analysis and predictions, making gait biometrics a valuable tool in both clinical and non-clinical settings.
Security and surveillance are other areas witnessing significant adoption of gait biometrics. As global security concerns rise, there is growing interest in non-invasive and unobtrusive identification methods. Gait biometrics offers a unique advantage in this aspect, as gait patterns are difficult to disguise or replicate. Governments and private security agencies are increasingly investing in gait biometric systems for enhanced security measures, thus fostering market expansion.
Regionally, North America and Europe are the leading markets for gait biometrics due to the presence of advanced healthcare infrastructure, high adoption of innovative technologies, and a strong focus on research and development. However, the Asia Pacific region is expected to exhibit the highest growth rate during the forecast period, fueled by increasing healthcare investments and rising awareness about biometric technologies.
The gait biometrics market by component is segmented into hardware, software, and services. The hardware segment includes sensors, cameras, and other physical devices used in capturing gait data. This segment is experiencing robust growth due to continuous advancements in sensor technology, providing higher accuracy and reliability in data collection. Moreover, the decreasing cost of sensors is making hardware more accessible to a broader range of applications, including consumer health and fitness devices.
Software is another crucial component of the gait biometrics market, encompassing gait analysis software and machine learning algorithms that process and interpret the collected data. This segment is expected to witness significant growth due to the increasing demand for sophisticated software solutions that can handle large datasets and provide detailed analysis. The integration of cutting-edge technologies such as artificial intelligence and cloud computing is further enhancing the capabilities of gait biometrics software.
Services form an integral part of the gait biometrics ecosystem, including installation, maintenance, and training services. As gait biometrics systems become more complex, the demand for specialized services is rising. Service providers play a crucial role in ensuring the smooth operation and optimal performance of gait biometrics systems. The increasing adoption of these systems across various sectors is driving the demand for comprehensive service packages.
The synergetic interaction between hardware and software components is vital for the effective functioning of gait biometrics systems. Advanced hardware captures high-quality data, while sophisticated software processes this data to generate actionable insights. This seamless integration is essential for the widespread adoption and success of gait biometrics technology across different applications.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset provides a collection of behavioural biometrics data (commonly known as Keyboard, Mouse and Touchscreen (KMT) dynamics). The data was collected for use in a FinTech research project undertaken by academics and researchers at the Computer Science Department, Edge Hill University, United Kingdom. The project, called CyberSIgnature, uses KMT dynamics data to distinguish between legitimate card owners and fraudsters. An application was developed with a graphical user interface (GUI) similar to a standard online card payment form, including fields for card type, name, card number, card verification code (cvc) and expiry date. User KMT dynamics were then captured while participants entered fictitious card information on the GUI application.
The dataset consists of 1,760 KMT dynamics instances collected over 88 user sessions on the GUI application. Each user session involves 20 data-entry iterations: the user is assigned a set of fictitious card information (drawn at random from a pool) to enter 10 times, and is subsequently presented with 10 additional sets of card information, each to be entered once. The 10 additional sets are drawn from a pool assigned, or to be assigned, to other users. A KMT data instance is collected during each data-entry iteration, so a total of 20 KMT data instances (10 legitimate and 10 illegitimate) were collected during each user session on the GUI application.
The raw dataset is stored in .json format within 88 separate files. The root folder, named `behaviour_biometrics_dataset`, consists of two sub-folders, `raw_kmt_dataset` and `feature_kmt_dataset`, and a Jupyter notebook file (`kmt_feature_classification.ipynb`). Their contents are described below:
-- `raw_kmt_dataset`: this folder contains 88 files, each named `raw_kmt_user_n.json`, where n is a number from 0001 to 0088. Each file contains 20 instances of KMT dynamics data corresponding to a given fictitious card, and the data instances are equally split between legitimate (n = 10) and illegitimate (n = 10) classes. The legitimate class corresponds to KMT dynamics captured from the user assigned to the card detail, while the illegitimate class corresponds to KMT dynamics collected from other users entering the same card detail.
-- `feature_kmt_dataset`: this folder contains two sub-folders, `feature_kmt_json` and `feature_kmt_xlsx`. Each contains 88 files (in the relevant format: .json or .xlsx), each named `feature_kmt_user_n`, where n is a number from 0001 to 0088. Each file contains 20 instances of features extracted from the corresponding `raw_kmt_user_n` file, including the class labels (legitimate = 1, illegitimate = 0).
-- `kmt_feature_classification.ipynb`: this file contains the Python code necessary to generate features from the raw KMT files and apply a simple machine learning classification task to generate results. The code is designed to run with minimal effort from the user.
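As a minimal complement to the notebook, the sketch below iterates over the raw KMT files and counts the instances per user. The folder layout follows the description above; the per-instance JSON schema is not spelled out here, so the sketch only inspects the top-level structure.

```python
import json
from pathlib import Path

# Minimal sketch for iterating over the raw KMT files. The top-level layout
# (88 files, 20 instances each, legitimate/illegitimate split) follows the
# dataset description; the per-instance field names are documented in the
# dataset itself, so only the top-level structure is inspected here.
root = Path("behaviour_biometrics_dataset/raw_kmt_dataset")

for user_file in sorted(root.glob("raw_kmt_user_*.json")):
    with user_file.open() as fh:
        instances = json.load(fh)  # expected: 20 KMT instances per user
    print(f"{user_file.name}: {len(instances)} instances")
```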
The BED dataset
Version 1.0.0
Please cite as: Arnau-González, P., Katsigiannis, S., Arevalillo-Herráez, M., Ramzan, N., "BED: A new dataset for EEG-based biometrics", IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12219 - 12230, 2021.
Disclaimer
While every care has been taken to ensure the accuracy of the data included in the BED dataset, the authors and the University of the West of Scotland, Durham University, and Universitat de València do not provide any guarantees and disclaim all responsibility and all liability (including, without limitation, liability in negligence) for all expenses, losses, damages (including indirect or consequential damage) and costs which you might incur as a result of the provided data being inaccurate or incomplete in any way and for any reason. © 2020, University of the West of Scotland, Scotland, United Kingdom.
Contact
For inquiries regarding the BED dataset, please contact:
Dataset summary
BED (Biometric EEG Dataset) is a dataset specifically designed to test EEG-based biometric approaches that use relatively inexpensive consumer-grade devices, more specifically the Emotiv EPOC+ in this case. This dataset includes EEG responses from 21 subjects to 12 different stimuli, across 3 different chronologically disjointed sessions. We have also considered stimuli aimed to elicit different affective states, so as to facilitate future research on the influence of emotions on EEG-based biometric tasks. In addition, we provide a baseline performance analysis to outline the potential of consumer-grade EEG devices for subject identification and verification. It must be noted that, in this work, EEG data were acquired in a controlled environment in order to reduce the variability in the acquired data stemming from external conditions.
The stimuli include:
For more details regarding the experimental protocol and the design of the dataset, please refer to the associated publication: Arnau-González, P., Katsigiannis, S., Arevalillo-Herráez, M., Ramzan, N., "BED: A new dataset for EEG-based biometrics", IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12219 - 12230, 2021.
Dataset structure and contents
The BED dataset contains EEG recordings from 21 subjects, acquired during 3 similar sessions for each subject. The sessions were spaced one week apart from each other.
The BED dataset includes:
The dataset is organised in 3 folders:
RAW/ Contains the RAW files
RAW/sN/ Contains the RAW files associated with subject N
Each folder sN is composed of the following files:
- sN_s1.csv, sN_s2.csv, sN_s3.csv -- Files containing the EEG recordings for subject N and session 1, 2, and 3, respectively. These files contain 39 columns:
COUNTER INTERPOLATED F3 FC5 AF3 F7 T7 P7 O1 O2 P8 T8 F8 AF4 FC6 F4 ...UNUSED DATA... UNIX_TIMESTAMP
- subject_N_session_1_time_X.log, subject_N_session_2_time_X.log, subject_N_session_3_time_X.log -- Log files containing the sequence of events for subject N and sessions 1, 2, and 3, respectively.
RAW_PARSED/
Contains Matlab files named sN_sM.mat. Each file contains the recordings for subject N in session M and is composed of two variables:
- recording: size (time@256Hz x 17), Columns: COUNTER INTERPOLATED F3 FC5 AF3 F7 T7 P7 O1 O2 P8 T8 F8 AF4 FC6 F4 UNIX_TIMESTAMP
- events: cell array with size (events x 3) START_UNIX END_UNIX ADDITIONAL_INFO
START_UNIX is the UNIX timestamp in which the event starts
END_UNIX is the UNIX timestamp in which the event ends
ADDITIONAL_INFO contains a struct with additional information regarding the specific event: for the images, the expected score and the voted score; for the cognitive task, the input; for the VEP, the pattern and the frequency; etc.
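A minimal Python sketch for working with the RAW_PARSED files is shown below, assuming the `recording` and `events` variables described above; the exact unpacking of the Matlab cell array may need adjustment, and the file name is illustrative.

```python
from scipy.io import loadmat
import numpy as np

# Minimal sketch, assuming the variable names given above ("recording",
# "events"); exact cell-array unpacking may vary with the SciPy version.
mat = loadmat("RAW_PARSED/s1_s1.mat")
recording = mat["recording"]          # (samples @ 256 Hz, 17 columns)
events = mat["events"]                # (n_events, 3) cell array

timestamps = recording[:, -1]         # last column: UNIX_TIMESTAMP

# Extract the EEG samples that fall inside the first event window.
start_unix = float(np.squeeze(events[0, 0]))
end_unix = float(np.squeeze(events[0, 1]))
mask = (timestamps >= start_unix) & (timestamps <= end_unix)
event_eeg = recording[mask, 2:16]     # the 14 EEG channels F3..F4
print(event_eeg.shape)
```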
Features/
Features/Identification
Features/Identification/[ARRC|MFCC|SPEC]/: Each of these folders contains the extracted features, ready for classification, for each of the stimuli. Each file is composed of the following variables:
- feat: N x number of features
- Y: N x 2 (the #subject and the #session)
- INFO: contains details about the event, same as ADDITIONAL_INFO above
Features/Verification: This folder is composed of 3 files, each containing a different set of extracted features. Each file is composed of one struct array with the following fields:
- data: the time-series features, as described in the paper
- y: the #subject
- stimuli: the stimuli by name
- session: the #session
- INFO: Contains details about the event
The features provided are in sequential order, so index 1 and index 2, etc. are sequential in time if they belong to the same stimulus.
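The sketch below shows one way the identification feature files might be used, training on two sessions and testing on the third; the file name and the 1-NN classifier are illustrative assumptions, not the baseline from the paper.

```python
from scipy.io import loadmat
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Sketch of a subject-identification run on one feature file, assuming the
# "feat"/"Y" variables described above. The file name is hypothetical.
data = loadmat("Features/Identification/MFCC/stimulus_01.mat")
feat, Y = data["feat"], data["Y"]        # Y columns: subject id, session id

train = Y[:, 1] != 3                     # train on sessions 1-2,
test = ~train                            # test on session 3

clf = KNeighborsClassifier(n_neighbors=1).fit(feat[train], Y[train, 0])
print("accuracy:", accuracy_score(Y[test, 0], clf.predict(feat[test])))
```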
Additional information
For additional information regarding the creation of the BED dataset, please refer to the associated publication: Arnau-González, P., Katsigiannis, S., Arevalillo-Herráez, M., Ramzan, N., "BED: A new dataset for EEG-based biometrics", IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12219 - 12230, 2021.
https://dataintelo.com/privacy-and-policy
The global biometrics in transportation market was valued at approximately USD 4.5 billion in 2023 and is projected to reach USD 12.9 billion by 2032, growing at a compound annual growth rate (CAGR) of 12.3% during the forecast period. The market size growth is being driven by increasing security demands, advancements in biometric technologies, and the integration of AI and IoT in transportation systems. The need for enhanced security measures in public transport systems and border controls has significantly spurred the adoption of biometric solutions, which are renowned for their accuracy and efficiency in identifying individuals.
A prominent growth factor for the biometrics in transportation market is the heightened emphasis on security across global transportation networks. As the frequency and sophistication of security threats increase, transportation authorities are prioritizing the implementation of robust security protocols to safeguard passengers and cargo. Biometric technologies, offering unique identification and verification capabilities, are being increasingly adopted in airports, railways, and other transportation hubs worldwide. This trend is further fueled by government initiatives mandating stringent security measures at critical points of infrastructure, as well as growing public acceptance of biometric solutions as a means of enhancing safety and convenience.
Another significant driver of market growth is the rapid technological advancements in biometric systems. The evolution of technologies such as fingerprint recognition, facial recognition, and iris recognition has led to more accurate, reliable, and user-friendly solutions. These innovations have expanded the potential applications of biometrics within the transportation sector, from securing access to restricted areas to streamlining passenger identification and boarding processes. Furthermore, the integration of artificial intelligence and machine learning into biometric systems has enhanced their capability to process and analyze massive datasets, thereby improving the speed and accuracy of identity verification.
The growing demand for seamless and contactless travel experiences is also contributing to the expansion of the biometrics in transportation market. In the wake of the COVID-19 pandemic, there has been an accelerated shift towards touchless solutions to minimize physical contact and reduce health risks. Biometric technologies are ideally suited to meet this demand, as they enable secure identification without the need for physical interaction. This is particularly relevant in airports, where biometric systems are increasingly being deployed to expedite the check-in, security screening, and boarding processes, thereby enhancing the overall passenger experience.
Regionally, North America holds a significant share of the biometrics in transportation market, driven by a strong focus on advanced security solutions and substantial investments in transportation infrastructure. The region's early adoption of biometric technologies in airports and border control applications has set a benchmark for other regions to follow. However, the Asia Pacific region is expected to witness the highest growth rate over the forecast period, propelled by rapid urbanization, increasing passenger traffic, and government initiatives to enhance transportation and security infrastructure. The growing economies in this region, such as China and India, are investing heavily in modernizing their transportation systems, creating lucrative opportunities for market expansion.
The technology segment within the biometrics in transportation market comprises fingerprint recognition, facial recognition, iris recognition, voice recognition, and other emerging technologies. Fingerprint recognition has traditionally been the most widely used biometric technology due to its cost-effectiveness, ease of use, and high accuracy. It is commonly employed in transportation settings for access control and verification purposes. However, its adoption is now giving way to more advanced technologies such as facial and iris recognition, which offer improved security and user experience. These technologies are increasingly being integrated into systems where rapid and contactless verification is a priority, such as airport check-ins and border control points.
Facial recognition technology has gained significant traction in recent years, becoming a preferred choice for many transportation authorities due to its non-intrusive nature and ability to quickly process large volumes of passengers.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Overview
This study aims to assess and compare stress responses in healthcare professionals during simulated pericardiocentesis, using both a conventional mannequin and a virtual reality (VR) environment. Pericardiocentesis, crucial in managing cardiac tamponade, demands precision and decision-making under pressure, potentially inducing significant stress.
Methodology
Participants will perform the procedure in both simulated environments, with biometric indicators such as heart rate, skin conductance, and electromyography recorded with a biosignal plux device to evaluate stress intensity. Subjective stress assessments will complement the biometric data, with statistical analysis identifying significant differences between the simulation methods.
Significance
The findings will elucidate the effectiveness of VR simulation in medical training compared to traditional mannequin-based methods, potentially guiding training decisions and improving healthcare professionals' readiness for high-stress situations, thereby enhancing patient safety and treatment efficacy.
Conclusion
This research aims to inform the optimal use of VR technology in medical education and to understand the psychological impacts of emergency medical procedures on healthcare providers, supporting technological innovation and psychological insight in medical education.
Files
The files were acquired during the performance of pericardiocentesis in both scenarios using a biosignal plux device and are in .txt format. These are the first 10 subjects of the project. The entire dataset comprises 65 students, although some files were recorded with some errors. If you need the complete dataset, please let me know. Thanks!
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset consists of 98,000 videos and selfies from 170 countries, providing a foundation for developing robust security systems and facial recognition algorithms.
While the dataset itself doesn't contain spoofing attacks, it's a valuable resource for testing liveness detection systems, allowing researchers to simulate attacks and evaluate how effectively their systems can distinguish between real faces and various forms of spoofing.
By utilizing this dataset, researchers can contribute to the development of advanced security solutions, enabling the safe and reliable use of biometric technologies for authentication and verification.
The dataset offers a high-quality collection of videos and photos, including selfies taken with a range of popular smartphones, like iPhone, Xiaomi, Samsung, and more. The videos showcase individuals turning their heads in various directions, providing a natural range of movements for liveness detection training.
Furthermore, the dataset provides detailed metadata for each set, including information like gender, age, ethnicity, video resolution, duration, and frames per second. This rich metadata provides crucial context for analysis and model development.
Researchers can develop more accurate liveness detection algorithms, which is crucial for achieving the iBeta Level 2 certification, a benchmark for robust and reliable biometric systems that prevent fraud.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global Responsible Gaming Biometric System market size reached USD 1.18 billion in 2024, with a robust year-over-year expansion driven by regulatory mandates and mounting social responsibility requirements. The market is expected to grow at a CAGR of 18.2% from 2025 to 2033, projecting a substantial increase to USD 5.26 billion by 2033. This growth is fueled by the rapid adoption of biometric technologies across both brick-and-mortar and digital gambling platforms, as well as the increasing emphasis on player protection and responsible gaming practices globally.
A key growth factor propelling the Responsible Gaming Biometric System market is the tightening of regulatory frameworks worldwide. As governments and regulatory bodies intensify their scrutiny of the gambling industry, operators are now mandated to implement robust identity verification and player monitoring systems. Biometric solutions, including facial recognition, fingerprint scanning, and voice authentication, are increasingly seen as essential tools for ensuring compliance with anti-money laundering (AML) and Know Your Customer (KYC) regulations. These systems not only help in preventing underage gambling and self-exclusion circumvention but also enable real-time monitoring of player behavior to detect signs of problem gambling, thereby fostering a safer gaming environment.
Technological advancements are another significant driver of the Responsible Gaming Biometric System market. The integration of artificial intelligence (AI) and machine learning with biometrics has enhanced the accuracy and reliability of player identification and behavioral analytics. This has enabled gaming operators to deploy sophisticated systems capable of analyzing vast datasets to identify risky behaviors, flag anomalies, and provide timely interventions. Furthermore, the proliferation of smartphones and advancements in cloud computing have made biometric solutions more accessible and scalable, allowing both large-scale casinos and online gambling platforms to implement these systems efficiently and cost-effectively.
Consumer awareness and demand for responsible gambling measures are also accelerating the adoption of biometric systems in the gaming sector. With the rising incidence of gambling addiction and its associated social costs, there is growing public pressure on operators to prioritize player welfare. Modern biometric systems offer a seamless and user-friendly approach to safeguarding players, enabling features such as self-exclusion, time and spending limits, and real-time alerts. As a result, gaming operators are increasingly investing in biometric technologies not only to comply with regulations but also to enhance their brand reputation and foster long-term customer loyalty.
Regionally, North America and Europe are at the forefront of the Responsible Gaming Biometric System market, accounting for the largest market shares due to their mature regulatory landscapes and high adoption rates of advanced technologies. The Asia Pacific region, however, is witnessing the fastest growth, driven by the expansion of the gambling industry and the increasing digitalization of gaming platforms. Countries like Australia, Singapore, and Japan are actively implementing responsible gaming measures, while emerging markets in Latin America and the Middle East & Africa are gradually following suit. This regional diversification is expected to create new opportunities for market players and further stimulate global market growth over the forecast period.
The Responsible Gaming Biometric System market is segmented by component into hardware, software, and services, each playing a crucial role in the deployment and effectiveness of biometric solutions. Hardware components include biometric scanners, cameras, and sensors that capture and process unique physiological or behavioral characteristics of players. These devices form the foundation of any biometric system, ensuring accurate and real-time data collection. The hardware segment remains dominant in physical casinos and gaming venues, where robust and tamper-proof devices are essential for on-site identity verification and access control. As the market matures, hardware innovations such as contactless sensors and multi-modal biometric devices are gaining traction, enabling enhanced security and user convenience.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Dataset Description:
The dataset comprises a collection of photos of people, organized into folders labeled "women" and "men." Each folder contains a significant number of images to facilitate training and testing of gender detection algorithms or models.
The dataset contains a variety of images capturing female and male individuals from diverse backgrounds, age groups, and ethnicities.
This labeled dataset can be utilized as training data for machine learning models, computer vision applications, and gender detection algorithms.
The dataset is split into train and test folders. Each folder includes:
- women and men folders: images of people of the corresponding gender
- a .csv file: information about the images and people in the dataset
keywords: biometric system, biometric system attacks, biometric dataset, face recognition database, face recognition dataset, face detection dataset, facial analysis, gender detection, supervised learning dataset, gender classification dataset, gender recognition dataset
Description:
The Printed Photos Attacks Dataset is a specialized resource designed for the development and evaluation of liveness detection systems aimed at combating facial spoofing attempts. This dataset includes a comprehensive collection of videos that feature both authentic facial presentations and spoof attempts using printed 2D photographs. By incorporating both real and fake faces, it provides a robust foundation for training and testing advanced facial recognition and anti-spoofing algorithms.
This dataset is particularly valuable for researchers and developers focused on enhancing biometric security systems. It introduces a novel method for learning and extracting distinctive facial features to effectively differentiate between genuine and spoofed inputs. The approach leverages deep neural networks (DNNs) and sophisticated biometric techniques, which have been shown to significantly improve the accuracy and reliability of liveness detection in various applications.
Key features of the dataset include:
Diverse Presentation Methods: The dataset contains a range of facial presentations, including genuine facial videos and spoof videos created using high-quality printed photographs. This diversity is essential for developing algorithms that can generalize across different types of spoofing attempts.
High-Resolution Videos: The videos in the dataset are captured in high resolution, ensuring that even subtle facial features and movements are visible, aiding in the accurate detection of spoofing.
Comprehensive Annotations: Each video is meticulously annotated with labels indicating whether the facial presentation is genuine or spoofed. Additionally, the dataset includes metadata such as the method of spoofing and environmental conditions, providing a rich context for algorithm development.
Unseen Spoof Detection: One of the unique aspects of this dataset is its emphasis on detecting unseen spoofing cues. The dataset is designed to challenge algorithms to identify and adapt to new types of spoofing methods that may not have been encountered during the training phase.
Versatile Application: The dataset is suitable for a wide range of applications, including access control systems, mobile device authentication, and other security-sensitive environments where facial recognition is deployed.
This dataset is sourced from Kaggle.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises fingerprint data collected from a total of 200 individuals, including 100 identified Type 2 Diabetes Mellitus (T2DM) patients and 100 control subjects. The dataset contains detailed fingerprint information, which can serve as a valuable resource for research and analysis in the context of T2DM and its potential associations with fingerprint characteristics. Researchers and analysts can use this dataset to explore patterns, conduct studies, and derive insights related to T2DM identification and its correlation with fingerprint features.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
KeyRecs is a keystroke dynamics dataset that can be used to train, validate, and test machine learning models for anomaly detection and robust typing-pattern recognition, as well as for the clustering and classification of users exhibiting similar behavior. It contains fixed-text and free-text samples of user typing behavior, obtained in a study with 100 participants of 20 different nationalities performing password retype and transcription exercises.
The samples consist of inter-key latencies computed by measuring the time between each key press and release during an exercise, following a digraph model. Participants were also asked to provide demographic information: age, gender, handedness, and nationality.
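To make the digraph model concrete, the sketch below derives the usual digraph timings (hold, press-press, release-press) from a toy sequence of key events; the (key, press_time, release_time) layout is an illustrative assumption, not the exact KeyRecs column schema.

```python
# Minimal sketch of the digraph timing model described above: for each pair
# of consecutive keys, derive hold and inter-key latencies from press/release
# timestamps (in seconds). The event layout here is illustrative.
events = [
    ("p", 0.000, 0.085),
    ("a", 0.140, 0.210),
    ("s", 0.265, 0.350),
]

for (k1, p1, r1), (k2, p2, r2) in zip(events, events[1:]):
    hold = r1 - p1           # how long the first key was held
    press_press = p2 - p1    # down-down latency
    release_press = p2 - r1  # up-down latency (negative when keys overlap)
    print(f"{k1}{k2}: hold={hold:.3f} PP={press_press:.3f} RP={release_press:.3f}")
```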
KeyRecs can be valuable to enhance the recognition of authorized users and prevent illegal logins in biometric authentication software, and can be combined with additional data recordings to create more extensive datasets and improve the generalization of machine learning models.
If you use this dataset, please cite the primary data article: https://doi.org/10.1016/j.dib.2023.109509
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset contains records of shorebirds captured, ringed, and measured with mist nets on the mudflats of the Dutch Wadden Sea at Schiermonnikoog. The dataset includes the sampling event and the record of the caught shorebird species. The most commonly used biometric measures of wild birds will be added in the future: wing length, mass, bill length, total head length, tarsus length, primary score, body moult index, and plumage cover.
All distance-learning participants (students, professors, instructors, mentors, tutors and others) would like to know how well students have assimilated the study materials being taught. The analysis and assessment of the knowledge students have acquired over a semester are an integral part of the independent-studies process at the most advanced universities worldwide. A formal test or exam during the semester would cause needless stress for students. To resolve this problem, the authors of this article have developed a Biometric and Intelligent Self-Assessment of Student Progress (BISASP) System. The obtained research results are comparable with the results from other similar studies. This article ends with two case studies that demonstrate practical operation of the BISASP System. The first case study analyses the interdependencies between microtremors, stress and student marks. The second case study compares the marks assigned to students during the e-self-assessment, prior to the e-test and during the e-test. The dependence determined in the second case study, between the marks students scored in the real examination and the marks based on their self-evaluation, is statistically significant (significance >0.99%). The original contribution of this article, compared to previously published research results, is as follows: the BISASP System developed by the authors is superior to traditional self-assessment systems due to its use of voice stress analysis and a special algorithm, which permits a more detailed analysis of the knowledge attained by a student.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The Biometric Attack dataset consists of >5k selfie images of people from >50 countries. Each participant provided one real-life selfie image. Live selfies help facial recognition models to identify real faces and detect spoofing attempts, decreasing false-negative results in liveness detection tests.
Selfies provide a diverse range of facial features, lighting conditions, and capturing devices, which are essential for training robust facial recognition models that can accurately distinguish between real and spoofed faces.
Liveness detection: This dataset is ideal for training and evaluating liveness detection models, enabling researchers to distinguish between real and spoofed data with high accuracy.
Keywords: Real life data, Live data, Selfie data, Antispoofing for AI, Liveness Detection dataset for AI, Spoof Detection dataset, Facial Recognition dataset, Biometric Authentication dataset, AI Dataset, Anti-Spoofing Technology, Facial Biometrics, Machine Learning Dataset, Deep Learning
The Custom Silicone Mask Attack Dataset (CSMAD) contains presentation attacks made using six custom-made silicone masks, each costing about USD 4,000. The dataset is designed for face presentation attack detection experiments.
The Custom Silicone Mask Attack Dataset (CSMAD) has been collected at the Idiap Research Institute. It is intended for face presentation attack detection experiments, where the presentation attacks have been mounted using a custom-made silicone mask of the person (or identity) being attacked.
The dataset contains videos of face presentations, as well as a set of files specifying the experimental protocol corresponding to the experiments presented in the publication below.
Reference
If you publish results using this dataset, please cite the following publication.
Sushil Bhattacharjee, Amir Mohammadi and Sebastien Marcel: "Spoofing Deep Face Recognition With Custom Silicone Masks." in Proceedings of International Conference on Biometrics: Theory, Applications, and Systems (BTAS), 2018.
10.1109/BTAS.2018.8698550
http://publications.idiap.ch/index.php/publications/show/3887
Data Collection
Face-biometric data has been collected from 14 subjects to create this dataset. Subjects participating in this data-collection have played three roles: targets, attackers, and bona-fide clients. The subjects represented in the dataset are referred to here with letter-codes: A .. N. The subjects A..F have also been targets. That is, face-data for these six subjects has been used to construct their corresponding flexible masks (made of silicone). These masks have been made by Nimba Creations Ltd., a special effects company.
Bona fide presentations have been recorded for all subjects A..N. Attack presentations (presentations where the subject wears one of 6 masks) have been recorded for all six targets, made by different subjects. That is, each target has been attacked several times, each time by a different attacker wearing the mask in question. This is one way of increasing the variability in the dataset. Another way we have augmented the variability of the dataset is by capturing presentations under different illumination conditions. Presentations have been captured in four different lighting conditions:
All presentations have been captured with a green uniform background. See the paper mentioned above for more details of the data-collection process.
Dataset Structure
The dataset is organized in three subdirectories: ‘attack’, ‘bonafide’, ‘protocols’. The two directories: ‘attack’ and ‘bonafide’ contain presentation-videos and still images for attacks and bona fide presentations, respectively. The folder ‘protocols’ contains text files specifying the experimental protocol for vulnerability analysis of face-recognition (FR) systems.
The number of data-files per category are as follows:
The folder ‘attack/WEAR’ contains videos where the attack has been made by a person (attacker) wearing the mask of the target being attacked. The ‘attack/STAND’ folder contains videos where the attack has been made using the target’s mask mounted on an appropriate stand.
Video File Format
The video files for the face presentations are in ‘hdf5’ format (with file-extension ‘.h5’). The folder structure of the hdf5 file is shown in Figure 1. Each file contains data collected using two cameras:
As shown in Figure 1, frames from the different channels (color, infrared, depth, thermal) from the two cameras are stored in separate directory hierarchies in the hdf5 file. Each file represents a video of approximately 10 seconds, or roughly 300 frames.
In the hdf5 file, the directory for SR300 also contains a subdirectory named ‘aligned_color_to_depth’. This folder contains post-processed data, where the frames of depth channel have been aligned with those of the color channel based on the time-stamps of the frames.
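A minimal sketch for inspecting one of these ‘.h5’ files with h5py is given below; rather than hard-coding the group names from Figure 1, it walks the file and lists every frame dataset it finds. The file path is illustrative.

```python
import h5py

# Minimal sketch for inspecting one CSMAD presentation video, assuming the
# two-camera/channel hierarchy described above. Group names are discovered
# at runtime rather than hard-coded; the path below is hypothetical.
with h5py.File("attack/WEAR/example_presentation.h5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)  # lists every channel's frame datasets
```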
Experimental Protocol
The ‘protocols’ folder contains text files that specify the protocols for vulnerability analysis experiments reported in the paper mentioned above. Please see the README file in the protocols folder for details.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Native American Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.
This dataset includes 5,000+ high-quality facial images, organized into individual participant sets, each containing:
To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:
Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:
This dataset is highly valuable for a wide range of AI and computer vision applications:
To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains gait-based biometric data collected from 5,000 users in a simulated environment for gait authentication in the Metaverse. It includes 16 key gait features extracted using OpenPose and MediaPipe and processed with feature engineering techniques for improved usability.
The dataset is valuable for gait-based authentication, user identification, and biometric security applications. It can be used for machine learning models, deep learning, and anomaly detection in gait recognition research.
Features include:
Stride length, step frequency, stance phase duration, swing phase duration
Hip, knee, and ankle joint angles
Ground reaction forces (GRFs), cadence variability, foot clearance
Gait symmetry index and more
Format: CSV
License: CC BY 4.0 (Attribution Required)
Citation: If using this dataset, please cite: Sandeep Ravikanti (2024). "Metaverse Gait Authentication Dataset (MGAD)." Zenodo. DOI: 10.5281/zenodo.14847773
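As a minimal sketch of how such a feature table might be used for user identification, the Python below loads a CSV and fits a simple classifier; the file name and the 'user_id' label column are assumptions to be checked against the Zenodo record.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Minimal sketch, assuming one CSV with the 16 gait features plus a user
# label column; the actual file and column names should be checked against
# the Zenodo record.
df = pd.read_csv("metaverse_gait_authentication.csv")
X = df.drop(columns=["user_id"])      # the 16 engineered gait features
y = df["user_id"]                     # enrolled-user labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("identification accuracy:", clf.score(X_te, y_te))
```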
The NIST Fingerprint Registration and Comparison Tool (NFRaCT) is a cross-platform GUI application which allows a user to load a pair of fingerprint images, find corresponding points in both images, register and crop the images, and finally compute a series of measurements on the registered images as described in NIST Special Publication 500-336.