12 datasets found
  1. DFDC Dataset

    • paperswithcode.com
    Cite
    Brian Dolhansky; Russ Howes; Ben Pflaum; Nicole Baram; Cristian Canton Ferrer, DFDC Dataset [Dataset]. https://paperswithcode.com/dataset/dfdc
    Authors
    Brian Dolhansky; Russ Howes; Ben Pflaum; Nicole Baram; Cristian Canton Ferrer
    Description

    The DFDC (Deepfake Detection Challenge) is a dataset for deepfake detection consisting of more than 100,000 videos.

    The DFDC dataset consists of two versions:

    • Preview dataset, with 5k videos, featuring two facial modification algorithms.
    • Full dataset, with 124k videos, featuring eight facial modification algorithms.
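
    As a rough illustration of how the full dataset is typically consumed, here is a minimal sketch of reading the per-part metadata.json labels; the file layout and field names are assumptions based on the Kaggle distribution of DFDC, not something this listing specifies:

    import json
    from pathlib import Path

    def load_dfdc_labels(part_dir):
        """Map each video filename to its 'REAL' / 'FAKE' label from a part's metadata.json."""
        with open(Path(part_dir) / "metadata.json") as f:   # assumed layout from the Kaggle release
            metadata = json.load(f)
        return {name: info["label"] for name, info in metadata.items()}

    # Example with a hypothetical downloaded training part.
    labels = load_dfdc_labels("dfdc_train_part_0")
    print(sum(v == "FAKE" for v in labels.values()), "fake clips out of", len(labels))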

  2. Open Media Forensics Challenge (OpenMFC) Evaluation Datasets

    • data.nist.gov
    • catalog.data.gov
    Updated Mar 4, 2022
    Cite
    Haiying Guan (2022). Open Media Forensics Challenge (OpenMFC) Evaluation Datasets [Dataset]. http://doi.org/10.18434/mds2-2410
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Haiying Guan
    License

    https://www.nist.gov/open/license

    Description

    The datasets contain the following parts for the Open Media Forensics Challenge (OpenMFC) evaluations:

    1. NC16 Kickoff dataset
    2. NC17 development and evaluation datasets
    3. MFC18 development and evaluation datasets
    4. MFC19 development and evaluation datasets
    5. MFC20 development and evaluation datasets
    6. OpenMFC2022 steg datasets
    7. OpenMFC2022 deepfake datasets

  3. Deepfake Detection - Faces - Sample

    • kaggle.com
    zip
    Updated Feb 4, 2020
    Cite
    Hieu Phung (2020). Deepfake Detection - Faces - Sample [Dataset]. https://www.kaggle.com/phunghieu/deepfake-detection-faces-sample
    Available download formats: zip (2889391689 bytes)
    Dataset updated
    Feb 4, 2020
    Authors
    Hieu Phung
    Description

    Context

    This dataset includes all detectable faces from the sample training dataset of the Deepfake Detection Challenge. Kaggle and the host expected and encouraged participants to train their models outside of Kaggle's notebook environment; however, for someone who prefers to stick to Kaggle's kernels, this dataset helps a lot 😄.
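
    A minimal sketch of iterating over the extracted face crops from inside a Kaggle kernel; the mount path, file extension, and folder-per-video layout are assumptions for illustration rather than details taken from this listing:

    from pathlib import Path
    from PIL import Image

    # Hypothetical mount point; Kaggle exposes attached datasets under /kaggle/input/<dataset-slug>.
    FACES_DIR = Path("/kaggle/input/deepfake-detection-faces-sample")

    def iter_face_crops(root):
        """Yield (video_id, PIL image) for every extracted face crop found under root."""
        for img_path in sorted(root.rglob("*.png")):
            video_id = img_path.parent.name        # assumed: one folder of crops per source video
            yield video_id, Image.open(img_path)

    # Example: inspect a few crops.
    for i, (vid, img) in enumerate(iter_face_crops(FACES_DIR)):
        print(vid, img.size)
        if i == 2:
            break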

  4. SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge (CtrSVDD...

    • zenodo.org
    zip
    Updated Mar 2, 2024
    Cite
    You Zhang; Yongyi Zang; Jiatong Shi; Ryuichi Yamamoto; Jionghao Han; Yuxun Tang; Shengyuan Xu; Wenxiao Zhao; Jing Guo; Tomoki Toda; Zhiyao Duan (2024). SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge (CtrSVDD Track, Test Set) [Dataset]. http://doi.org/10.5281/zenodo.10742049
    Available download formats: zip
    Dataset updated
    Mar 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    You Zhang; Yongyi Zang; Jiatong Shi; Ryuichi Yamamoto; Jionghao Han; Yuxun Tang; Shengyuan Xu; Wenxiao Zhao; Jing Guo; Tomoki Toda; Zhiyao Duan
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    For more information about SVDD Challenge 2024, please refer to https://challenge.singfake.org/.

    We have released the test set here.

  5. ADD 2023 Challenge Track 2 Evaluation Dataset

    • explore.openaire.eu
    Updated Jan 8, 2025
    Cite
    Yi, Jiangyan; Zhang, Chu Yuan (2025). ADD 2023 Challenge Track 2 Evaluation Dataset [Dataset]. http://doi.org/10.5281/zenodo.12176904
    Dataset updated
    Jan 8, 2025
    Authors
    Yi, Jiangyan; Zhang, Chu Yuan
    Description

    Audio deepfake detection is an emerging topic in the artificial intelligence community. The second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around the world to build innovative new technologies that can further accelerate and foster research on detecting and analyzing deepfake speech utterances. Unlike previous challenges (e.g. ADD 2022), ADD 2023 moves beyond binary real/fake classification to localizing the manipulated intervals in partially fake speech and pinpointing the source responsible for generating any fake audio. Furthermore, ADD 2023 includes more rounds of evaluation for the fake audio game sub-challenge. The ADD 2023 challenge (http://addchallenge.cn/add2023) includes three sub-challenges: audio fake game (FG), manipulation region location (RL), and deepfake algorithm recognition (AR). This paper describes the datasets, evaluation metrics, and protocols. Some findings on audio deepfake detection tasks are also reported.

    The ADD 2023 dataset is publicly available.

    This dataset is licensed under the CC BY-NC-ND 4.0 license.

    If you use this dataset, please cite the following paper:
    Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li: ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130
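
    Since Track 2 asks systems to localize manipulated intervals within an utterance, a simple overlap measure between predicted and reference regions illustrates the task; this intersection-over-union sketch is illustrative only and may differ from the challenge's official metric:

    def merge(intervals):
        """Merge overlapping [start, end) intervals (times in seconds)."""
        merged = []
        for s, e in sorted(intervals):
            if merged and s <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], e)
            else:
                merged.append([s, e])
        return merged

    def interval_iou(pred, ref):
        """Overlap-over-union between predicted and reference manipulated regions."""
        pred, ref = merge(pred), merge(ref)
        inter = sum(max(0.0, min(pe, re_) - max(ps, rs))
                    for ps, pe in pred for rs, re_ in ref)
        union = sum(e - s for s, e in pred) + sum(e - s for s, e in ref) - inter
        return inter / union if union > 0 else 1.0

    # Example: one predicted region against two reference regions.
    print(interval_iou([(1.0, 2.5)], [(0.8, 1.5), (2.0, 3.0)]))   # ~0.45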

  6. ADD-Dev

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    (2024). ADD-Dev [Dataset]. https://service.tib.eu/ldmservice/dataset/add-dev
    Dataset updated
    Dec 2, 2024
    Description

    The dataset used for the manipulated region location task in the second Audio Deepfake Detection Challenge (ADD 2023).

  7. ADD 2023 Challenge Track 3 Evaluation Dataset

    • zenodo.org
    bin
    Updated Jul 26, 2024
    Cite
    Jiangyan Yi; Chu Yuan Zhang (2024). ADD 2023 Challenge Track 3 Evaluation Dataset [Dataset]. http://doi.org/10.5281/zenodo.12179884
    Available download formats: bin
    Dataset updated
    Jul 26, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jiangyan Yi; Chu Yuan Zhang
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description
    Audio deepfake detection is an emerging topic in the artificial intelligence community. The second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around the world to build innovative new technologies that can further accelerate and foster research on detecting and analyzing deepfake speech utterances. Unlike previous challenges (e.g. ADD 2022), ADD 2023 moves beyond binary real/fake classification to localizing the manipulated intervals in partially fake speech and pinpointing the source responsible for generating any fake audio. Furthermore, ADD 2023 includes more rounds of evaluation for the fake audio game sub-challenge. The ADD 2023 challenge (http://addchallenge.cn/add2023) includes three sub-challenges: audio fake game (FG), manipulation region location (RL), and deepfake algorithm recognition (AR). This paper describes the datasets, evaluation metrics, and protocols. Some findings on audio deepfake detection tasks are also reported.


    The ADD 2023 dataset is publicly available.

    This dataset is licensed under the CC BY-NC-ND 4.0 license.

    If you use this dataset, please cite the following paper:
    Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li: ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130
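
    Track 3 (deepfake algorithm recognition) is a multi-class problem: each fake utterance must be attributed to its generation algorithm. A minimal sketch of summarizing such predictions with a confusion matrix and accuracy; the label names here are hypothetical and the official scoring may differ:

    from collections import Counter

    def confusion_and_accuracy(y_true, y_pred):
        """Return ((true, predicted) -> count, overall accuracy) for algorithm-recognition labels."""
        confusion = Counter(zip(y_true, y_pred))
        accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
        return confusion, accuracy

    # Hypothetical labels: 'genuine' plus spoofing-algorithm classes.
    y_true = ["genuine", "algoA", "algoB", "algoA"]
    y_pred = ["genuine", "algoA", "algoA", "algoA"]
    conf, acc = confusion_and_accuracy(y_true, y_pred)
    print(acc)                          # 0.75
    print(conf[("algoB", "algoA")])     # one algoB clip mistaken for algoA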

  8. Flickr-Faces-HQ Dataset (Nvidia) - Part 8

    • kaggle.com
    Updated Dec 23, 2019
    Cite
    xhlulu (2019). Flickr-Faces-HQ Dataset (Nvidia) - Part 8 [Dataset]. https://www.kaggle.com/xhlulu/flickrfaceshq-dataset-nvidia-part-8/discussion
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 23, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    xhlulu
    Description
  9. ADD 2023 Challenge Track 3 Training/Development Dataset

    • zenodo.org
    application/gzip, bin
    Updated Jul 26, 2024
    Cite
    Jiangyan Yi; Chu Yuan Zhang (2024). ADD 2023 Challenge Track 3 Training/Development Dataset [Dataset]. http://doi.org/10.5281/zenodo.12179632
    Available download formats: bin, application/gzip
    Dataset updated
    Jul 26, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jiangyan Yi; Chu Yuan Zhang
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description
    Audio deepfake detection is an emerging topic in the artificial intelligence community. The second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around the world to build innovative new technologies that can further accelerate and foster research on detecting and analyzing deepfake speech utterances. Unlike previous challenges (e.g. ADD 2022), ADD 2023 moves beyond binary real/fake classification to localizing the manipulated intervals in partially fake speech and pinpointing the source responsible for generating any fake audio. Furthermore, ADD 2023 includes more rounds of evaluation for the fake audio game sub-challenge. The ADD 2023 challenge (http://addchallenge.cn/add2023) includes three sub-challenges: audio fake game (FG), manipulation region location (RL), and deepfake algorithm recognition (AR). This paper describes the datasets, evaluation metrics, and protocols. Some findings on audio deepfake detection tasks are also reported.


    The ADD 2023 dataset is publicly available.

    This dataset is licensed under the CC BY-NC-ND 4.0 license.

    If you use this dataset, please cite the following paper:
    Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li: ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130

  10. The track 2 evaluation dataset of ADD 2022

    • zenodo.org
    bin, txt, zip
    Updated Jul 2, 2024
    Cite
    Jiangyan Yi; Xiaohui Zhang (2024). The track 2 evaluation dataset of ADD 2022 [Dataset]. http://doi.org/10.5281/zenodo.12187997
    Available download formats: txt, bin, zip
    Dataset updated
    Jul 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jiangyan Yi; Xiaohui Zhang
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Audio deepfake detection is an emerging topic that was included in ASVspoof 2021; however, recent shared tasks have not covered many real-life and challenging scenarios. The first Audio Deep synthesis Detection challenge (ADD) was organized to fill this gap. ADD 2022 (http://addchallenge.cn/add2022) includes three tracks: low-quality fake audio detection (LF), partially fake audio detection (PF), and audio fake game (FG). The LF track focuses on bona fide and fully fake utterances with various real-world noises; the PF track aims to distinguish partially fake audio from real audio; and the FG track is a rivalry game comprising two tasks: an audio generation task and an audio fake detection task. In this paper, we describe the datasets, evaluation metrics, and protocols. We also report major findings that reflect recent advances in audio deepfake detection tasks.

    The ADD 2022 dataset is publicly available.

    This dataset is licensed under the CC BY-NC-ND 4.0 license.

    If you use this dataset, please cite the following paper:

    Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
    ADD 2022: the first Audio Deep Synthesis Detection Challenge. ICASSP 2022: 9216-9220

  11. Global Fake Image Detection Market Size By Component (Software, Services),...

    • verifiedmarketresearch.com
    Updated Apr 8, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Global Fake Image Detection Market Size By Component (Software, Services), By Application (Incident Reporting, Cyber Defense), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/fake-image-detection-market/
    Dataset updated
    Apr 8, 2024
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2024 - 2031
    Area covered
    Global
    Description

    Fake Image Detection Market size was valued at USD 276.65 Million in 2024 and is projected to reach USD 1417.59 Million by 2031, growing at a CAGR of 22.66% from 2024 to 2031.

    Global Fake Image Detection Market Overview

    The widespread availability of image editing software and social media platforms has led to a surge in fake images, including digitally altered photos and manipulated visual content. This trend has fueled the demand for advanced detection solutions capable of identifying and flagging fake images in real-time. With the proliferation of fake news and misinformation online, there is an increasing awareness among consumers, businesses, and governments about the importance of combating digital fraud and preserving the authenticity of visual content. This heightened concern is driving investments in fake image detection technologies to mitigate the risks associated with misinformation.

    However, despite advancements in AI and ML, detecting fake images remains a complex and challenging task, especially when dealing with sophisticated techniques such as deepfakes and generative adversarial networks (GANs). Developing robust detection algorithms capable of identifying increasingly sophisticated forms of image manipulation poses a significant challenge for researchers and developers. The deployment of fake image detection technologies raises concerns about privacy and data ethics, particularly regarding the collection and analysis of visual content shared online. Balancing the need for effective detection with respect for user privacy and ethical considerations remains a key challenge for stakeholders in the Fake Image Detection Market.

  12. Codecfake dataset - training set (part 2 of 3)

    • zenodo.org
    bin
    Updated May 16, 2024
    Cite
    Yuankun Xie (2024). Codecfake dataset - training set (part 2 of 3) [Dataset]. http://doi.org/10.5281/zenodo.11171720
    Available download formats: bin
    Dataset updated
    May 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yuankun Xie
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This dataset is the training set (part 2 of 3) of the Codecfake dataset, corresponding to the manuscript "The Codecfake Dataset and Countermeasures for Universal Deepfake Audio Detection".

    Abstract

    With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for effective detection methods. Unlike traditional deepfake audio generation, which often involves multi-step processes culminating in vocoder usage, ALMs directly use neural codec methods to decode discrete codes into audio. Moreover, driven by large-scale data, ALMs exhibit remarkable robustness and versatility, posing a significant challenge to current audio deepfake detection (ADD) models. To effectively detect ALM-based deepfake audio, we focus on the mechanism of the ALM-based audio generation method: the conversion from neural codec to waveform. We first construct the Codecfake dataset, an open-source large-scale dataset, including two languages, millions of audio samples, and various test conditions, tailored for ALM-based audio detection. Additionally, to achieve universal detection of deepfake audio and tackle the domain ascent bias issue of the original SAM, we propose the CSAM strategy to learn a domain-balanced and generalized minimum. Experimental results demonstrate that co-training on the Codecfake dataset and a vocoded dataset with the CSAM strategy yields the lowest average Equal Error Rate (EER) of 0.616% across all test conditions compared to baseline models.
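
    The abstract reports performance as Equal Error Rate (EER). A minimal sketch of estimating EER from per-utterance detection scores, as an illustration of the metric rather than the evaluation code released with the dataset:

    import numpy as np

    def compute_eer(genuine_scores, fake_scores):
        """Equal Error Rate: the operating point where false-accept and false-reject rates meet."""
        thresholds = np.sort(np.concatenate([genuine_scores, fake_scores]))
        far = np.array([(fake_scores >= t).mean() for t in thresholds])    # fakes accepted
        frr = np.array([(genuine_scores < t).mean() for t in thresholds])  # genuine rejected
        idx = np.argmin(np.abs(far - frr))
        return float((far[idx] + frr[idx]) / 2)

    # Example with synthetic scores (higher score = more likely genuine).
    rng = np.random.default_rng(0)
    print(f"{compute_eer(rng.normal(1, 1, 1000), rng.normal(-1, 1, 1000)):.3%}")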

    Codecfake Dataset

    Due to platform restrictions on the size of zenodo repositories, we have divided the Codecfake dataset into various subsets as shown in the table below:

    Subset | Contents | Link
    training set (part 1 of 3) & label | train_split.zip & train_split.z01 - train_split.z06 | https://zenodo.org/records/11171708
    training set (part 2 of 3) | train_split.z07 - train_split.z14 | https://zenodo.org/records/11171720
    training set (part 3 of 3) | train_split.z15 - train_split.z19 | https://zenodo.org/records/11171724
    development set | dev_split.zip & dev_split.z01 - dev_split.z02 | https://zenodo.org/records/11169872
    test set (part 1 of 2) | Codec test: C1.zip - C6.zip & ALM test: A1.zip - A3.zip | https://zenodo.org/records/11169781
    test set (part 2 of 2) | Codec unseen test: C7.zip | https://zenodo.org/records/11125029
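
    The training and development subsets are distributed as multi-part zip archives (.z01, .z02, ...). Assuming the parts were produced with Info-ZIP's split-archive feature, here is a minimal sketch of recombining and extracting one subset after all of its parts have been downloaded into the same directory (the merge step shells out to the standard zip tool):

    import subprocess
    import zipfile
    from pathlib import Path

    def merge_and_extract(split_zip, out_dir):
        """Recombine a split archive (name.zip + name.z01..zNN in one directory) and extract it."""
        merged = Path(split_zip).with_name("merged_" + Path(split_zip).name)
        # Info-ZIP's `zip -s 0 ... --out ...` rewrites a split archive as a single-file archive.
        subprocess.run(["zip", "-s", "0", str(split_zip), "--out", str(merged)], check=True)
        with zipfile.ZipFile(merged) as zf:
            zf.extractall(out_dir)

    # Example for the training split listed above, after downloading every part.
    merge_and_extract("train_split.zip", "codecfake_train")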

    Countermeasure

    The source code of the countermeasure and the pre-trained model are available on GitHub: https://github.com/xieyuankun/Codecfake.

    The Codecfake dataset and pre-trained model are licensed under the CC BY-NC-ND 4.0 license.

