The ISOT Fake News dataset is a compilation of several thousand fake news and truthful articles, obtained from different legitimate news sites and from sites flagged as unreliable by Politifact.com.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Fake audio detection is a growing concern, and several relevant datasets have been designed for research. However, there is no standard public Chinese dataset under additive noise conditions. In this paper, we aim to fill this gap and design a Chinese fake audio detection dataset (FAD) for studying more generalized detection methods. Twelve mainstream speech generation techniques are used to generate the fake audio. To simulate real-life scenarios, three noise datasets are selected for noise addition at five different signal-to-noise ratios (SNRs). The FAD dataset can be used not only for fake audio detection but also for recognizing which algorithm generated a fake utterance, which is useful for audio forensics. Baseline results are presented with analysis. The results show that building fake audio detection methods that generalize well remains challenging.
The FAD dataset is publicly available. The source code of the baselines is available on GitHub: https://github.com/ADDchallenge/FAD
The FAD dataset is designed to evaluate methods for fake audio detection, fake-algorithm recognition, and other related studies. To better study the robustness of these methods under the noisy conditions encountered in real life, we also construct a corresponding noisy dataset. The full FAD dataset therefore comes in two versions: a clean version and a noisy version. Both versions are divided into disjoint training, development, and test sets in the same way, with no speaker overlap across the three subsets. Each test set is further divided into a seen and an unseen test set; the unseen test set evaluates how well methods generalize to unknown types. Notably, both the real audio and the fake audio in the unseen test set are unknown to the model.
For the noisy part, we select three noise databases for simulation. Additive noise is added to each audio in the clean dataset at five different SNRs. The additive noise for the unseen test set comes from a different noise database than the noise used for the remaining subsets. In each version of the FAD dataset, there are 138,400 utterances in the training set, 14,400 in the development set, 42,000 in the seen test set, and 21,000 in the unseen test set. More detailed statistics are given in Table 2.
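As a rough illustration of how the two versions and four splits might be enumerated, here is a minimal Python sketch. The root path and directory names are assumptions made purely for illustration; the released archive's exact layout may differ.

```python
import os

# Hypothetical layout: <root>/<version>/<split>/*.wav -- adjust to the real release.
FAD_ROOT = "/path/to/FAD"      # placeholder root directory
VERSIONS = ["clean", "noisy"]  # the two released versions
SPLITS = {                     # split name -> expected utterance count (per version)
    "train": 138400,
    "dev": 14400,
    "test_seen": 42000,
    "test_unseen": 21000,
}

def list_utterances(version, split):
    """Yield paths of .wav files under the assumed <root>/<version>/<split> layout."""
    split_dir = os.path.join(FAD_ROOT, version, split)
    for name in sorted(os.listdir(split_dir)):
        if name.endswith(".wav"):
            yield os.path.join(split_dir, name)

if __name__ == "__main__":
    for version in VERSIONS:
        for split, expected in SPLITS.items():
            found = sum(1 for _ in list_utterances(version, split))
            print(f"{version}/{split}: {found} files (expected {expected})")
```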
Clean Real Audios Collection
To eliminate the interference of irrelevant factors, we collect clean real audio from two kinds of sources: five open corpora from the OpenSLR platform (http://www.openslr.org/12/) and one self-recorded dataset.
Clean Fake Audios Generation
We select 11 representative speech synthesis methods to generate fully fake audio, plus one method that produces partially fake audio.
Noisy Audios Simulation
Noisy audio is used to quantify the robustness of methods under noisy conditions. To simulate real-life scenarios, we sample noise signals and add them to the clean audio at five different SNRs: 0 dB, 5 dB, 10 dB, 15 dB, and 20 dB. The additive noise is drawn from three noise databases: PNL 100 Nonspeech Sounds, NOISEX-92, and TAU Urban Acoustic Scenes.
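The Python sketch below shows one common way to mix additive noise into a clean signal at a target SNR. It is a generic illustration under an assumed convention (power-based SNR computed over the whole utterance), not the exact script used to build the FAD noisy version.

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Mix a noise signal into clean speech at a target SNR (in dB)."""
    # Tile or truncate the noise so it matches the speech length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    # Scale the noise so that 10 * log10(P_speech / P_noise) == snr_db.
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # avoid division by zero
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Example: mix synthetic signals at each SNR used in FAD.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of a 440 Hz tone at 16 kHz
noise = rng.normal(size=8000)
for snr in (0, 5, 10, 15, 20):
    noisy = add_noise_at_snr(clean, noise, snr)
```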
This data set is licensed with a CC BY-NC-ND 4.0 license.
You can cite the data using the following BibTeX entry:
@inproceedings{ma2022fad,
  title={FAD: A Chinese Dataset for Fake Audio Detection},
  author={Haoxin Ma and Jiangyan Yi and Chenglong Wang and Xunrui Yan and Jianhua Tao and Tao Wang and Shiming Wang and Le Xu and Ruibo Fu},
  booktitle={Submitted to the 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks},
  year={2022},
}
Around 23 percent of Americans stated that they strongly agreed that CNN regularly reports made-up or fake news about Donald Trump and his administration. An additional 16 percent strongly disagreed with this statement, and 15 percent had no opinion, despite the divisive subject matter.
CNN
CNN ranks as one of the most popular news networks in the United States and boasts successful affiliates that can be accessed by people in over 200 countries around the world. Over 45 percent of Americans report that they watch the network, and it is generally seen as a credible source of news and information. Over half of Americans find the network to be at least somewhat credible, but 21 percent strongly disagree, implying highly polarized opinions along lines of political affiliation. Democrats are much more likely to watch CNN than their Republican and Independent counterparts, suggesting that the network is at least somewhat left-leaning in its coverage.
Fake news
Popularized by Donald Trump during the 2016 election cycle, the term 'fake news' is often used by the president and his supporters to describe news stories and networks that they believe to be spreading false information. Over 50 percent of Americans believe that online news websites regularly report fake news stories, while only nine percent think otherwise. Fake news is often difficult to identify, and many news consumers in countries across the globe struggle to separate fact from fiction.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is designed to support research in fake news detection across four major Indian languages: Gujarati, Hindi, Marathi, and Telugu. The dataset includes a diverse set of news articles collected from various sources, each labeled as either 'fake' or 'real'. The primary goal is to provide a resource that helps in the development and evaluation of natural language processing (NLP) models capable of detecting fake news in these regional languages.
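As an illustration of how such a corpus could be used, here is a minimal baseline sketch with scikit-learn. The file name and column names (`text`, `label`) are hypothetical placeholders, and character n-grams are chosen only because they tend to transfer reasonably across the four scripts involved.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical file and column names; adjust to the released files.
df = pd.read_csv("indic_fake_news.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42, stratify=df["label"]
)

# Character n-gram TF-IDF + logistic regression as a simple language-agnostic baseline.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```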
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Fake Face Vs Real Face is a dataset for object detection tasks - it contains Fake Face annotations for 494 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
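For example, a download via the `roboflow` Python package (`pip install roboflow`) might look like the sketch below. The API key, workspace slug, project slug, and version number are placeholders; copy the exact values from the dataset's Roboflow page.

```python
from roboflow import Roboflow

# Placeholder credentials and slugs -- replace with the values shown on Roboflow.
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("fake-face-vs-real-face")

# Download one dataset version in a chosen annotation format (e.g. COCO).
dataset = project.version(1).download("coco")
print(dataset.location)  # local folder containing images and annotations
```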
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
This dataset was created by hyeojuKim
This dataset was created by Yashvardhan Thakker
This dataset was created by Ekatra
Released under Other (specified in description)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A dataset of fake Spanish ID documents for training fake-ID detectors. Base material is taken from the MIDV-2020 dataset.
In 2022, mobile app install fraud across all examined categories on Android devices consisted for the most part of bot operations. Bots were used in around 75 percent of fraudulent installs of finance apps, as well as in over 75 percent of fake installs of social apps. Midcore gaming apps were more likely than other app categories to be targeted by click flooding, while hypercasual gaming apps saw approximately 24 percent of their fraudulent installs come from fake publisher activity.