According to a survey conducted in March 2025 in the United States, ** percent of respondents aged 65 and older reported being very concerned about the spread of video and audio deep fakes generated via artificial intelligence (AI), compared to ** percent of those aged 18 to 29. Overall, a majority of U.S. citizens across all age groups expressed concern about AI-generated deep fakes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SDFVD 2.0 is an augmented extension of the original SDFVD dataset, which contained 53 real and 53 fake videos. This new version was created to enhance the diversity and robustness of the dataset by applying augmentation techniques (horizontal flip, rotation, shear, brightness and contrast adjustment, additive Gaussian noise, and downscaling followed by upscaling) to the original videos. These augmentations simulate a wider range of conditions and variations, making the dataset better suited to training and evaluating deep learning models for deepfake detection. The process significantly expanded the dataset, resulting in 461 real and 461 forged videos and providing a richer, more varied collection of video data for deepfake detection research and development.
Dataset Structure
The dataset is organized into two main directories, real and fake, each containing the original and augmented videos. Each augmented video file is named following the pattern: ‘
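The augmentations listed above can be sketched at the frame level. The snippet below is a minimal NumPy-only illustration; the dataset's actual augmentation pipeline and parameter values are not specified, so the alpha, beta, and sigma values here are assumptions:

```python
import numpy as np

def horizontal_flip(frame):
    # Mirror the frame left-to-right (axis 1 is the width axis).
    return frame[:, ::-1]

def adjust_brightness_contrast(frame, alpha=1.2, beta=10):
    # alpha scales contrast, beta shifts brightness; clip to the 8-bit range.
    out = alpha * frame.astype(np.float32) + beta
    return np.clip(out, 0, 255).astype(np.uint8)

def add_gaussian_noise(frame, sigma=5.0, seed=0):
    # Additive Gaussian noise with standard deviation sigma.
    noise = np.random.default_rng(seed).normal(0.0, sigma, frame.shape)
    return np.clip(frame.astype(np.float32) + noise, 0, 255).astype(np.uint8)

def down_up_scale(frame, factor=2):
    # Crude nearest-neighbour downscale then upscale, discarding fine detail.
    small = frame[::factor, ::factor]
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

# A flat grey 64x64 RGB frame stands in for a real video frame.
frame = np.full((64, 64, 3), 128, dtype=np.uint8)
augmented = [horizontal_flip(frame), adjust_brightness_contrast(frame),
             add_gaussian_noise(frame), down_up_scale(frame)]
```

In practice these transforms would be applied per frame across each video (e.g., with OpenCV or imgaug), with rotation and shear handled by a proper affine warp rather than array slicing.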
Artificial intelligence-generated deepfakes are videos or photos that can be used to depict someone speaking or doing something that they did not actually say or do. Deepfakes are being used more frequently in cybercrime. A global 2022 survey found that 71 percent of consumers claimed they did not know what a deepfake video was, while 29 percent claimed to be familiar with the term.
https://www.gnu.org/licenses/gpl-3.0.html
This data set contains the original training data and the processed results for the fake satellite images, along with the code used to detect fake satellite images.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ECG Dataset
This repository contains a small version of the ECG dataset: https://huggingface.co/datasets/deepsynthbody/deepfake_ecg, split into training, validation, and test sets. The dataset is provided as CSV files and corresponding ECG data files in .asc format. The ECG data files are organized into separate folders for the train, validation, and test sets.
Folder Structure
.
├── train.csv
├── validate.csv
├── test.csv
├── train
│   ├── file_1.asc
│   ├── file_2.asc
│   …
See the full description on the dataset page: https://huggingface.co/datasets/deepsynthbody/deepfake-ecg-small.
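A minimal loader for this layout might pair each CSV row with its .asc file. The snippet below builds a tiny mock copy of the directory structure and reads it back; the CSV column names and the whitespace-separated .asc sample format are assumptions to be checked against the dataset card:

```python
import csv
import tempfile
from pathlib import Path

# Build a miniature mock of the layout above (contents are illustrative).
root = Path(tempfile.mkdtemp())
(root / "train").mkdir()
(root / "train.csv").write_text("filename,label\nfile_1.asc,fake\n")
# Assumption: each .asc line holds whitespace-separated integer samples.
(root / "train" / "file_1.asc").write_text("10 -3 7 0\n12 -1 5 2\n")

def load_split(root, split):
    # Pair each CSV row with the parsed samples from its .asc file.
    records = []
    with open(root / f"{split}.csv", newline="") as f:
        for row in csv.DictReader(f):
            asc_path = root / split / row["filename"]
            samples = [[int(v) for v in line.split()]
                       for line in asc_path.read_text().splitlines()]
            records.append((row["filename"], row["label"], samples))
    return records

records = load_split(root, "train")
```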
https://market.us/privacy-policy/
The Deepfake Detection Market is estimated to reach USD 5,609.3 million by 2034, riding on a strong 47.6% CAGR throughout the forecast period.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The main purpose of this data set is to facilitate research into audio DeepFakes.
These generated media files have been increasingly used to commit impersonation attempts or online harassment.
The data set consists of 88,600 generated audio clips (16-bit PCM wav).
All of these samples were generated by four different neural network architectures.
Additionally, we examined a bigger version of MelGAN and investigated a variant of Multi-Band MelGAN that computes its auxiliary loss over the full audio instead of its subbands.
Collection Process
For WaveGlow, we utilize the official implementation (commit 8afb643) in conjunction with the official pre-trained network on PyTorch Hub.
We use a popular implementation available on GitHub (commit 12c677e) for the remaining networks.
The repository also offers pre-trained models.
We used the pre-trained networks to generate samples that are similar to their respective training distributions, LJ Speech and JSUT.
When sampling the data set, we first extract Mel spectrograms from the original audio files, using the pre-processing scripts of the corresponding repositories.
We then feed these Mel spectrograms to the respective models to obtain the data set.
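The extraction step can be illustrated with a minimal log-Mel spectrogram computed in NumPy. This is a generic sketch, not the repositories' actual pre-processing scripts; the parameters (n_fft=1024, hop=256, 80 mel bands) are common vocoder defaults assumed here for illustration:

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale (HTK formula).
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    return fb

def log_mel_spectrogram(wav, sr=22050, n_fft=1024, hop=256, n_mels=80):
    # Frame the signal, take magnitude FFTs, project onto the mel filterbank.
    frames = [wav[s:s + n_fft] * np.hanning(n_fft)
              for s in range(0, len(wav) - n_fft, hop)]
    mag = np.abs(np.fft.rfft(np.stack(frames), axis=1))  # (T, n_fft//2 + 1)
    mel = mag @ mel_filterbank(n_mels, n_fft, sr).T      # (T, n_mels)
    return np.log(np.clip(mel, 1e-5, None))

wav = np.sin(2 * np.pi * 440 * np.arange(22050) / 22050)  # 1 s, 440 Hz tone
spec = log_mel_spectrogram(wav)
# A pre-trained vocoder (WaveGlow, MelGAN, ...) would then map `spec` to audio.
```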
This data set is licensed under a CC BY-SA 4.0 license.
This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy -- EXC-2092 CaSa -- 390781972.
This dataset includes all detectable faces of the corresponding part of the full dataset. Kaggle and the host expected and encouraged us to train our models outside of Kaggle’s notebooks environment; however, for someone who prefers to stick to Kaggle's kernels, this dataset would help a lot 😄.
Can be used for a variety of purposes, e.g., classification.
Want somewhere to start? Check out this demo 😉.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Interview protocols, recordings, and transcripts of three focus groups to investigate the social perception of AI and deepfake technology at the Massachusetts Institute of Technology. The focus groups are described below:
Focus Group #1 (engaged public): 12 participants in a 3-session Make A Fake class; the students were offered a full course refund in return for their participation in the study, which took place immediately following the final session of the class on Monday 27 February, 2023.
Focus Group #2 (attentive public): 14 visitors to the MIT Museum who volunteered to participate in the discussion after being recruited in the museum itself. The activity was scheduled for the week following recruitment, Monday 24 April, 2023; as compensation for their involvement, participants were offered a refund of their museum admission fee and two additional tickets for another day.
Focus Group #3 (nonattentive public): 13 pedestrians who were recruited with the help of 4 MIT volunteers working in the immediate environs of the Boston Public Library and the adjacent Prudential Center Shopping Mall. Participants were offered a $70 Amazon Gift Card in consideration for one hour of conversation on the same day of their recruitment, Saturday 27 May, 2023.
According to a survey conducted in August 2023, six in ten adults in the United States said they were concerned about the spread of misleading artificial intelligence (AI) video and audio deep fakes. In contrast, only two percent of U.S. citizens reported not being concerned at all about AI-generated deep fakes.
A human-face deepfake dataset sampled from large datasets.
High-quality dataset, diverse dataset, challenging dataset, large dataset, text prompts
https://choosealicense.com/licenses/other/
VeriChain Deepfake Detection Dataset
Dataset Description
This repository hosts the dataset for the VeriChain project, specifically curated for classifying images into three distinct categories: Real, AI-Generated, and Deepfake. The data is intended for training and evaluating robust models capable of identifying manipulated or synthetic media. This dataset was sourced and processed from the original AI-vs-Deepfake-vs-Real dataset.
Dataset Structure
The data… See the full description on the dataset page: https://huggingface.co/datasets/einrafh/verichain-deepfake-data.
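Since the full structure description is truncated here, the sketch below assumes a conventional one-directory-per-class layout (Real / AI-Generated / Deepfake) and shows how such a tree could be indexed into (path, label) pairs; verify the actual layout on the dataset page:

```python
import tempfile
from pathlib import Path

# Assumed layout: one sub-directory per class; check the dataset page.
LABELS = {"Real": 0, "AI-Generated": 1, "Deepfake": 2}

root = Path(tempfile.mkdtemp())
for name in LABELS:
    class_dir = root / name
    class_dir.mkdir()
    # JPEG magic bytes stand in for a real image file.
    (class_dir / "sample.jpg").write_bytes(b"\xff\xd8\xff\xe0")

def index_dataset(root):
    # Collect (path, integer label) pairs for every image in each class dir.
    return sorted((p, LABELS[p.parent.name])
                  for cls in LABELS
                  for p in (root / cls).glob("*.jpg"))

pairs = index_dataset(root)
```

An index like this can feed any standard image-classification training loop, with the three integer labels as targets.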
According to our latest research, the global Deepfake Detection Accelerator market size in 2024 is valued at USD 1.23 billion, reflecting a robust response to the growing threat of synthetic media and manipulated content. The market is expected to expand at a remarkable CAGR of 28.7% from 2025 to 2033, reaching a forecasted value of USD 10.18 billion by 2033. This substantial growth is driven by increasing awareness of the risks associated with deepfakes, rapid advancements in artificial intelligence, and a surge in demand for real-time content authentication across diverse sectors. As per our latest research, the proliferation of deepfake technologies and the resulting security and reputational risks are compelling organizations and governments to invest significantly in detection accelerators, thereby propelling market expansion.
One of the primary growth factors for the Deepfake Detection Accelerator market is the exponential increase in the creation and dissemination of deepfake content across digital platforms. As deepfakes become more sophisticated and accessible, businesses, media outlets, and public institutions are recognizing the urgent need for robust detection solutions. The proliferation of social media, coupled with the ease of sharing multimedia content, has heightened the risk of misinformation, identity theft, and reputational damage. This has led to a surge in investments in advanced deepfake detection technologies, particularly accelerators that can process and analyze vast volumes of data in real time. The growing public awareness about the potential societal and economic impacts of deepfakes is further fueling the adoption of these solutions.
Another significant driver is the rapid evolution of artificial intelligence and machine learning algorithms, which are the backbone of deepfake detection accelerators. The ability to leverage AI-powered hardware and software for identifying manipulated content has substantially improved detection accuracy and speed. Enterprises and governments are increasingly relying on these accelerators to safeguard sensitive information, ensure content authenticity, and maintain compliance with emerging regulations. The integration of deepfake detection accelerators into existing cybersecurity frameworks is becoming a standard practice, especially in sectors such as finance, healthcare, and government, where data integrity is paramount. This technological synergy is expected to sustain the market’s momentum throughout the forecast period.
The regulatory landscape is also playing a critical role in shaping the growth trajectory of the Deepfake Detection Accelerator market. Governments across major economies are enacting stringent policies and guidelines to combat the spread of malicious synthetic content. These regulations mandate organizations to implement advanced detection mechanisms, thereby driving the demand for high-performance accelerators. Furthermore, industry collaborations and public-private partnerships are fostering innovation in the development of scalable and interoperable deepfake detection solutions. The increasing frequency of high-profile deepfake incidents is prompting regulatory bodies to accelerate the adoption of these technologies, ensuring market growth remains on an upward trajectory.
From a regional perspective, North America currently leads the global deepfake detection accelerator market, accounting for the largest share in 2024. This dominance can be attributed to the presence of key technology providers, a mature cybersecurity ecosystem, and proactive regulatory initiatives. Europe follows closely, driven by strict data protection laws and increased investments in AI research. The Asia Pacific region is emerging as a high-growth market, fueled by the rapid digital transformation of its economies and rising concerns about deepfake-related cyber threats. Latin America and the Middle East & Africa are also witnessing increased adoption, albeit at a slower pace, as awareness and infrastructure development continue to progress. Overall, the global market is poised for sustained growth, with regional dynamics playing a pivotal role in shaping future trends.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for the article: Dramatic deepfake tales of the world: Analogical reasoning, AI-generated political (mis-)infotainment, and the distortion of global affairs
In 2024, Poles most often recognized false information generated by artificial intelligence by facial features and, above all, by movement inconsistent with the spoken words.
The detection and localization of highly realistic deepfake audio-visual content are challenging even for the most advanced state-of-the-art methods. While most of the research efforts in this domain are focused on detecting high-quality deepfake images and videos, only a few works address the problem of the localization of small segments of audio-visual manipulations embedded in real videos. In this research, we emulate the process of such content generation and propose the AV-Deepfake1M dataset. The dataset contains content-driven (i) video manipulations, (ii) audio manipulations, and (iii) audio-visual manipulations for more than 2K subjects resulting in a total of more than 1M videos. The paper provides a thorough description of the proposed data generation pipeline accompanied by a rigorous analysis of the quality of the generated data. The comprehensive benchmark of the proposed dataset utilizing state-of-the-art deepfake detection and localization methods indicates a significant drop in performance compared to previous datasets. The proposed dataset will play a vital role in building the next-generation deepfake localization methods. The dataset and associated code are available at https://github.com/ControlNet/AV-Deepfake1M.
https://www.polarismarketresearch.com/privacy-policy
The deepfake AI market size was valued at USD 794.55 million in 2024 and is estimated to grow at a CAGR of 41.5% from 2025 to 2034.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
With the rapid development of deep learning techniques, the generation and counterfeiting of multimedia material are becoming increasingly straightforward to perform. At the same time, sharing fake content on the web has become so simple that malicious users can create unpleasant situations with minimal effort. Forged media are also getting more and more complex, with manipulated videos (e.g., deepfakes, where both the visual and audio contents can be counterfeited) overtaking still images. The multimedia forensic community has addressed the possible threats that this situation implies by developing detectors that verify the authenticity of multimedia objects. However, the vast majority of these tools analyze only one modality at a time. This was not a problem as long as still images were the most widely edited media, but now that manipulated videos are becoming customary, performing monomodal analyses could be reductive. Nonetheless, the literature lacks multimodal detectors (systems that consider both audio and video components). This is due to the difficulty of developing them, but also to the scarcity of datasets containing forged multimodal data to train and test the designed algorithms.
In this paper we focus on the generation of an audio-visual deepfake dataset. First, we present a general pipeline for synthesizing speech deepfake content from a given real or fake video, facilitating the creation of counterfeit multimodal material. The proposed method uses Text-to-Speech (TTS) and Dynamic Time Warping (DTW) techniques to achieve realistic speech tracks. Then, we use the pipeline to generate and release TIMIT-TTS, a synthetic speech dataset containing the most cutting-edge methods in the TTS field. This can be used as a standalone audio dataset, or combined with DeepfakeTIMIT and VidTIMIT video datasets to perform multimodal research. Finally, we present numerous experiments to benchmark the proposed dataset in both monomodal (i.e., audio) and multimodal (i.e., audio and video) conditions. This highlights the need for multimodal forensic detectors and more multimodal deepfake data.
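The Dynamic Time Warping step mentioned above can be illustrated with a minimal, self-contained implementation on 1-D sequences (the actual TIMIT-TTS pipeline applies DTW to speech feature sequences rather than raw scalars, and its exact formulation is not given here):

```python
import numpy as np

def dtw_path(a, b):
    # Classic dynamic-time-warping alignment between two 1-D sequences.
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    # Backtrack the cheapest path from the end to the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j), (i, j - 1), (i - 1, j - 1)],
                   key=lambda p: cost[p])
    return float(cost[n, m]), path[::-1]

# Aligning a 3-sample track to a time-stretched 4-sample rendition.
distance, alignment = dtw_path([1, 2, 3], [1, 2, 2, 3])
```

Warping the synthesized speech along such an alignment is what lets the TTS track line up with the mouth movements of the target video.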
Resources for the initial version, TIMIT-TTS v1.0:
Arxiv: https://arxiv.org/abs/2209.08000
TIMIT-TTS Database v1.0: https://zenodo.org/record/6560159
ELSA - Multimedia use case
ELSA Multimedia is a large collection of Deep Fake images generated using diffusion models.
Dataset Summary
This dataset was developed as part of the EU project ELSA. Specifically for the Multimedia use-case. Official webpage: https://benchmarks.elsa-ai.eu/ This dataset aims to develop effective solutions for detecting and mitigating the spread of deep fake images in multimedia content. Deep fake images, which are highly realistic and deceptive… See the full description on the dataset page: https://huggingface.co/datasets/elsaEU/ELSA_D3.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global edge-based robot deepfake detector market size reached USD 1.37 billion in 2024, driven by the escalating need for advanced security and authentication methods in robotics and automation. The market is exhibiting robust growth with a compound annual growth rate (CAGR) of 22.4% from 2025 to 2033. By the end of the forecast period, the market is projected to attain a value of USD 10.48 billion in 2033. This remarkable expansion is attributed to the increasing sophistication of deepfake technologies, which has necessitated the deployment of real-time, edge-based detection solutions across diverse sectors such as security, industrial automation, healthcare, and consumer electronics.
A primary growth driver for the edge-based robot deepfake detector market is the rapid proliferation of deepfake content and the corresponding surge in security threats. As deepfake algorithms become more sophisticated, both public and private organizations are compelled to invest in advanced detection solutions that can operate in real time on the edge. Robots and automated systems, especially those deployed in sensitive environments like government installations, critical infrastructure, and healthcare, are increasingly vulnerable to malicious deepfake attacks. The integration of edge-based detection ensures that these systems can autonomously identify and neutralize threats without relying on centralized cloud processing, thereby reducing latency and enhancing operational security. Growing awareness about the potential risks posed by deepfakes, coupled with regulatory mandates for robust security frameworks, is further accelerating the adoption of edge-based deepfake detectors in robotics.
Another significant factor fueling market growth is the technological advancement in artificial intelligence (AI) and machine learning (ML) algorithms tailored for edge computing environments. The development of lightweight, yet highly accurate, deepfake detection models that can be embedded directly into robotic hardware has revolutionized the market landscape. These innovations enable real-time data analysis and threat identification without the need for continuous connectivity or extensive cloud resources, making them ideal for deployment in remote or bandwidth-constrained settings. The synergy between AI-driven detection and edge hardware is also fostering the emergence of new applications within industrial automation, automotive, and consumer electronics, where robots are expected to operate autonomously and securely in dynamic environments.
The expanding adoption of edge-based robot deepfake detectors is also being propelled by the increasing demand for privacy-preserving solutions. In sectors like healthcare and finance, where sensitive data is processed by robotic systems, ensuring data privacy and compliance with regulations such as GDPR and HIPAA is paramount. Edge-based solutions minimize the transmission of raw data to external servers, enabling organizations to maintain tighter control over their information assets. Additionally, the growing trend of Industry 4.0 and the Internet of Things (IoT) has amplified the deployment of interconnected robotic systems, further emphasizing the need for decentralized, edge-native security mechanisms. These trends are expected to sustain the momentum of the market throughout the forecast period.
From a regional perspective, North America currently dominates the edge-based robot deepfake detector market, accounting for the largest revenue share in 2024. The region’s leadership is underpinned by the presence of major technology firms, a robust innovation ecosystem, and early adoption of AI-based security solutions across industries. However, Asia Pacific is anticipated to witness the fastest growth over the coming years, driven by rapid industrialization, increasing investments in automation, and heightened awareness of cybersecurity threats. Europe, Latin America, and the Middle East & Africa are also experiencing steady growth, supported by regulatory initiatives and growing digital transformation efforts. The global landscape is thus characterized by a dynamic interplay of technological innovation, regulatory imperatives, and evolving threat vectors.
The component segment of the edge-based robot deepfake detector market is divided into hardware, software, and services. Hardware forms the backbone of edge