The DFDC (Deepfake Detection Challenge) is a dataset for deepfake detection consisting of more than 100,000 videos.
The DFDC dataset consists of two versions:
- Preview dataset, with 5k videos, featuring two facial modification algorithms
- Full dataset, with 124k videos, featuring eight facial modification algorithms
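For readers loading the Kaggle release, here is a minimal sketch of reading the labels, assuming the training metadata.json maps each video filename to a record with a "label" field of "REAL" or "FAKE" (the filenames below are hypothetical):

```python
import json
from collections import Counter

# Hedged sketch: tally REAL vs FAKE labels in a DFDC-style metadata.json.
# Assumes the Kaggle training layout, where metadata.json maps each video
# filename to a record containing a "label" field ("REAL" or "FAKE").

def label_counts(metadata: dict) -> Counter:
    """Count how many videos carry each label."""
    return Counter(record["label"] for record in metadata.values())

# Example with an inline metadata snippet (filenames are hypothetical):
metadata = json.loads("""
{
  "aagfhgtpmv.mp4": {"label": "FAKE", "split": "train", "original": "vudstovrck.mp4"},
  "vudstovrck.mp4": {"label": "REAL", "split": "train"}
}
""")
print(dict(label_counts(metadata)))
```

Because fake videos heavily outnumber real ones in the full dataset, a quick count like this is a common first step before deciding on a sampling strategy.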
https://www.nist.gov/open/license
The datasets contain the following parts for Open Media Forensics Challenge (OpenMFC) evaluations:
1. NC16 Kickoff dataset
2. NC17 development and evaluation datasets
3. MFC18 development and evaluation datasets
4. MFC19 development and evaluation datasets
5. MFC20 development and evaluation datasets
6. OpenMFC2022 steg datasets
7. OpenMFC2022 deepfake datasets
This dataset includes all detectable faces of the sample training dataset in Deepfake Detection Challenge. Kaggle and the host expected and encouraged us to train our models outside of Kaggle’s notebooks environment; however, for someone who prefers to stick to Kaggle's kernels, this dataset would help a lot 😄.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
For more information about SVDD Challenge 2024, please refer to https://challenge.singfake.org/.
We have released the test set here.
Audio deepfake detection is an emerging topic in the artificial intelligence community. The second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around the world to build innovative technologies that can further accelerate and foster research on detecting and analyzing deepfake speech utterances. Unlike previous challenges (e.g. ADD 2022), ADD 2023 goes beyond binary real/fake classification: it asks participants to localize the manipulated intervals in partially fake speech and to pinpoint the source responsible for generating any fake audio. Furthermore, ADD 2023 includes more rounds of evaluation for the fake audio game sub-challenge. The ADD 2023 challenge (http://addchallenge.cn/add2023) includes three sub-challenges: audio fake game (FG), manipulation region location (RL) and deepfake algorithm recognition (AR). This paper describes the datasets, evaluation metrics, and protocols, and also reports some findings in audio deepfake detection tasks.

The ADD 2023 dataset is publicly available. This dataset is licensed under a CC BY-NC-ND 4.0 license. If you use this dataset, please cite the following paper:

Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li: ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130
The dataset used for the manipulated region location task in the second Audio Deepfake Detection Challenge (ADD 2023).
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li: ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130
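The manipulation region location task scores how well a system localizes the fake intervals within an utterance. As a hedged illustration only (the official ADD 2023 metric is defined in the challenge paper cited above), a frame-level F1 over binary fake/real frame labels might look like:

```python
# Hedged sketch: frame-level F1 for manipulated-region location.
# Assumes each utterance is labeled per frame: 1 = fake frame, 0 = real.
# (The official ADD 2023 RL metric is defined in the challenge paper;
# this is only an illustrative frame-level variant.)

def frame_f1(ref, hyp):
    """F1 score treating 'fake' frames as the positive class."""
    assert len(ref) == len(hyp)
    tp = sum(1 for r, h in zip(ref, hyp) if r == 1 and h == 1)
    fp = sum(1 for r, h in zip(ref, hyp) if r == 0 and h == 1)
    fn = sum(1 for r, h in zip(ref, hyp) if r == 1 and h == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: a 10-frame utterance with frames 4-7 manipulated;
# the prediction is shifted by one frame.
ref = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
hyp = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
print(round(frame_f1(ref, hyp), 3))  # 0.75
```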
Original Thread: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/122786
All the links:
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li: ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Audio deepfake detection is an emerging topic that was included in ASVspoof 2021. However, recent shared tasks have not covered many real-life and challenging scenarios. The first Audio Deep synthesis Detection challenge (ADD) was motivated to fill in this gap. ADD 2022 (http://addchallenge.cn/add2022) includes three tracks: low-quality fake audio detection (LF), partially fake audio detection (PF) and audio fake game (FG). The LF track focuses on bona fide and fully fake utterances corrupted by various real-world noises. The PF track aims to distinguish partially fake audio from real audio. The FG track is a rivalry game comprising two tasks: an audio generation task and an audio fake detection task. In this paper, we describe the datasets, evaluation metrics, and protocols. We also report major findings that reflect recent advances in audio deepfake detection tasks.
The ADD 2022 dataset is publicly available.
This data set is licensed with a CC BY-NC-ND 4.0 license.
If you use this dataset, please cite the following paper:
Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
ADD 2022: the first Audio Deep Synthesis Detection Challenge. ICASSP 2022: 9216-9220
https://www.verifiedmarketresearch.com/privacy-policy/
Fake Image Detection Market size was valued at USD 276.65 Million in 2024 and is projected to reach USD 1417.59 Million by 2031, growing at a CAGR of 22.66% from 2024 to 2031.
Global Fake Image Detection Market Overview
The widespread availability of image editing software and social media platforms has led to a surge in fake images, including digitally altered photos and manipulated visual content. This trend has fueled the demand for advanced detection solutions capable of identifying and flagging fake images in real-time. With the proliferation of fake news and misinformation online, there is an increasing awareness among consumers, businesses, and governments about the importance of combating digital fraud and preserving the authenticity of visual content. This heightened concern is driving investments in fake image detection technologies to mitigate the risks associated with misinformation.
However, despite advancements in AI and ML, detecting fake images remains a complex and challenging task, especially when dealing with sophisticated techniques such as deepfakes and generative adversarial networks (GANs). Developing robust detection algorithms capable of identifying increasingly sophisticated forms of image manipulation poses a significant challenge for researchers and developers. The deployment of fake image detection technologies raises concerns about privacy and data ethics, particularly regarding the collection and analysis of visual content shared online. Balancing the need for effective detection with respect for user privacy and ethical considerations remains a key challenge for stakeholders in the Fake Image Detection Market.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This dataset is the training set (part 2 of 3) of the Codecfake dataset, corresponding to the manuscript "The Codecfake Dataset and Countermeasures for Universal Deepfake Audio Detection".
With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for effective detection methods. Unlike traditional deepfake audio generation, which often involves multi-step processes culminating in vocoder usage, ALM directly utilizes neural codec methods to decode discrete codes into audio. Moreover, driven by large-scale data, ALMs exhibit remarkable robustness and versatility, posing a significant challenge to current audio deepfake detection (ADD) models. To effectively detect ALM-based deepfake audio, we focus on the mechanism of the ALM-based audio generation method: the conversion from neural codec to waveform. We initially construct the Codecfake dataset, an open-source large-scale dataset, including two languages, millions of audio samples, and various test conditions, tailored for ALM-based audio detection. Additionally, to achieve universal detection of deepfake audio and tackle the domain ascent bias issue of the original SAM, we propose the CSAM strategy to learn a domain-balanced and generalized minima. Experiment results demonstrate that co-training on the Codecfake dataset and a vocoded dataset with the CSAM strategy yields the lowest average Equal Error Rate (EER) of 0.616% across all test conditions compared to baseline models.
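The EER quoted above is the standard metric for audio deepfake detection: the operating point where the false acceptance rate (FAR) equals the false rejection rate (FRR). As a minimal sketch, here is a simple threshold sweep (real evaluation toolkits interpolate the ROC curve instead):

```python
# Hedged sketch of Equal Error Rate (EER), the metric quoted above.
# EER is the operating point where the false acceptance rate (FAR)
# equals the false rejection rate (FRR). Real toolkits interpolate
# the ROC curve; this minimal version just sweeps thresholds.

def eer(fake_scores, real_scores):
    """Higher score = more likely fake; returns an approximate EER in [0, 1]."""
    thresholds = sorted(set(fake_scores) | set(real_scores))
    best = 1.0
    for t in thresholds:
        far = sum(s >= t for s in real_scores) / len(real_scores)  # real flagged as fake
        frr = sum(s < t for s in fake_scores) / len(fake_scores)   # fake passed as real
        best = min(best, max(far, frr))  # closest point to FAR == FRR
    return best

# Toy example: a perfectly separating detector has EER 0.
print(eer([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))  # 0.0
```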
Due to platform restrictions on the size of zenodo repositories, we have divided the Codecfake dataset into various subsets as shown in the table below:
| Codecfake dataset | description | link |
| --- | --- | --- |
| training set (part 1 of 3) & label | train_split.zip & train_split.z01 - train_split.z06 | https://zenodo.org/records/11171708 |
| training set (part 2 of 3) | train_split.z07 - train_split.z14 | https://zenodo.org/records/11171720 |
| training set (part 3 of 3) | train_split.z15 - train_split.z19 | https://zenodo.org/records/11171724 |
| development set | dev_split.zip & dev_split.z01 - dev_split.z02 | https://zenodo.org/records/11169872 |
| test set (part 1 of 2) | Codec test: C1.zip - C6.zip & ALM test: A1.zip - A3.zip | https://zenodo.org/records/11169781 |
| test set (part 2 of 2) | Codec unseen test: C7.zip | https://zenodo.org/records/11125029 |
The source code of the countermeasure and pre-trained model are available on GitHub https://github.com/xieyuankun/Codecfake.
The Codecfake dataset and pre-trained model are licensed with CC BY-NC-ND 4.0 license.