100+ datasets found

Leading social networks used for news in the U.S. 2019-2025
statista.com
Updated Jan 9, 2024
Cite
Amy Watson (2024). Leading social networks used for news in the U.S. 2019-2025 [Dataset]. https://www.statista.com/topics/3251/fake-news/
Explore at:
Dataset updated
Jan 9, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Amy Watson
Description
In 2025, Facebook remained the most-used social platform for news in the United States, with 32 percent of respondents reporting they accessed news on it. YouTube followed closely at 30 percent, recording a slight increase from the previous year. X (formerly Twitter) saw the most notable growth, rising by eight percent to 23 percent.
Fake News Statistics By Impacts, AI, Country, Misinformation, Frequency,...
coolest-gadgets.com
Updated Jan 9, 2025
Cite
Coolest Gadgets (2025). Fake News Statistics By Impacts, AI, Country, Misinformation, Frequency, Media Outlets And Economic Losses [Dataset]. https://coolest-gadgets.com/fake-news-statistics/
Explore at:
Dataset updated
Jan 9, 2025
Dataset authored and provided by
Coolest Gadgets
License
https://coolest-gadgets.com/privacy-policy
Time period covered
2022 - 2032
Area covered
Global
Description
Introduction

Fake News Statistics: Fake news has become a major problem in today's digital age. It spreads quickly through social media and other online platforms, often misleading people. Fake news spreads faster than real news, creating confusion and mistrust among people worldwide. Statistics and trends for 2024 reveal that many people have encountered fake news online, and many have shared it unknowingly.

Fake news affects public opinion, political decisions, and even relationships. This article helps us understand how widespread it is and address it more effectively. Raising awareness and encouraging critical thinking can reduce its impact, and reliable statistics and research are essential for uncovering the truth and stopping the spread of false information. Everyone plays a role in combating fake news.
CT-FAN-21 corpus: A dataset for Fake News Detection
zenodo.org
Updated Oct 23, 2022
+ more versions
Cite
Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl (2022). CT-FAN-21 corpus: A dataset for Fake News Detection [Dataset]. http://doi.org/10.5281/zenodo.4714517
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.4714517
Dataset updated
Oct 23, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl
Description
Data Access: The data in this research collection may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use it only for research purposes. Due to these restrictions, the collection is not open data. Please download the Data Sharing Agreement and send the signed form to fakenewstask@gmail.com.

Citation

Please cite our work as

@article{shahi2021overview, title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection}, author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas}, journal={Working Notes of CLEF}, year={2021} }

Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially false, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English.

Subtask 3A: Multi-class fake news detection of news articles (English). Subtask 3A frames fake news detection as a four-class classification problem. The training data will be released in batches and comprises roughly 900 articles with their respective labels. Given the text of a news article, determine whether the main claim made in the article is true, partially false, false, or other. Our definitions for the categories are as follows:

False - The main claim made in an article is untrue.

Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

True - This rating indicates that the primary elements of the main claim are demonstrably true.

Other- An article that cannot be categorised as true, false, or partially false due to lack of evidence about its claims. This category includes articles in dispute and unproven articles.

Subtask 3B: Topical Domain Classification of News Articles (English). Fact-checkers require background expertise to identify the truthfulness of an article. The categorisation will help to automate the sampling process from a stream of data. Given the text of a news article, determine the topical domain of the article (English). This is a classification problem. The task is to categorise fake news articles into six topical categories such as health, election, crime, climate, and education. This task will be offered for a subset of the data of Subtask 3A.

Input Data

The data will be provided in the format of Id, title, text, rating, the domain; the description of the columns is as follows:

Task 3a

ID- Unique identifier of the news article

Title- Title of the news article

text- Text mentioned inside the news article

our rating - class of the news article as false, partially false, true, other

Task 3b

public_id- Unique identifier of the news article

Title- Title of the news article

text- Text mentioned inside the news article

domain - domain of the given news article(applicable only for task B)

Output data format

Task 3a

public_id- Unique identifier of the news article

predicted_rating- predicted class

Sample File

public_id, predicted_rating
1, false
2, true

Task 3b

public_id- Unique identifier of the news article

predicted_domain- predicted domain

Sample file

public_id, predicted_domain
1, health
2, crime
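
A minimal sketch of how the two sample files above could be produced, assuming plain comma-separated text with a header row. The helper name write_submission is hypothetical, and whether a space is expected after each comma (as rendered in the samples) is not stated, so this sketch omits it.

```python
import csv
import io


def write_submission(predictions, label_column="predicted_rating"):
    """Serialise (public_id, label) pairs as comma-separated text with a header row.

    Hypothetical helper: the column names come from the task description,
    everything else is an assumption.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["public_id", label_column])
    for public_id, label in predictions:
        writer.writerow([public_id, label])
    return buffer.getvalue()


# Task 3a: one predicted rating per article.
submission_3a = write_submission([(1, "false"), (2, "true")])

# Task 3b: one predicted domain per article.
submission_3b = write_submission(
    [(1, "health"), (2, "crime")], label_column="predicted_domain"
)

print(submission_3a)
print(submission_3b)
```

To write a file instead of a string, the same rows can be passed to csv.writer on a file opened with newline="".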

Additional data for Training

To train your model, you can use additional data in a similar format; some datasets are available on the web. We do not provide the ground truth for those datasets. For testing, we will not use any articles from other datasets. Some possible sources:

Fakenews Classification Datasets

Fake News Detection Challenge KDD 2020

FakeNewsNet

IMPORTANT!

The fake news articles used for Task 3b are a subset of those used for Task 3a.

We have used the data from 2010 to 2021, and the content of fake news is mixed up with several topics like election, COVID-19 etc.

Evaluation Metrics

This task is evaluated as a classification task. We will use the F1-macro measure for the ranking of teams. There is a limit of 5 runs (total and not per day), and only one person from a team is allowed to submit runs.
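
For reference, the F1-macro measure is the unweighted mean of the per-class F1 scores, where each class score equals 2*TP / (2*TP + FP + FN). The sketch below illustrates this in plain Python; it is not the organisers' scoring script. The function name macro_f1 and the choice to score a class with no gold or predicted instances as 0 are assumptions.

```python
def macro_f1(gold, predicted, labels):
    """Unweighted mean of per-class F1 scores (illustrative, not the official scorer)."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denominator = 2 * tp + fp + fn
        # Assumption: a class that never occurs in gold or predictions scores 0.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


LABELS = ["false", "partially false", "true", "other"]
gold = ["false", "true", "partially false", "other"]
predicted = ["false", "true", "false", "other"]

score = macro_f1(gold, predicted, LABELS)
print(round(score, 4))
```

Because every class counts equally, a rare class that is always misclassified pulls the score down as much as a frequent one would.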

Submission Link: https://competitions.codalab.org/competitions/31238

Related Work

Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf

G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in Workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14

Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104
CT-FAN: A Multilingual dataset for Fake News Detection
data.niaid.nih.gov
Updated Oct 23, 2022
Cite
Melanie Siegel (2022). CT-FAN: A Multilingual dataset for Fake News Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4714516
Explore at:
Dataset updated
Oct 23, 2022
Dataset provided by
Thomas Mandl
Gautam Kishore Shahi
Julia Maria Struß
Michael Wiegand
Melanie Siegel
Juliane Köhler
Description
By downloading the data, you agree with the terms & conditions mentioned below:

Data Access: The data in the research collection may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use them only for research purposes.

Summaries, analyses and interpretations of the linguistic properties of the information may be derived and published, provided it is impossible to reconstruct the information from these summaries. You may not try identifying the individuals whose texts are included in this dataset. You may not try to identify the original entry on the fact-checking site. You are not permitted to publish any portion of the dataset besides summary statistics or share it with anyone else.

We grant you the right to access the collection's content as described in this agreement. You may not otherwise make unauthorised commercial use of, reproduce, prepare derivative works, distribute copies, perform, or publicly display the collection or parts of it. You are responsible for keeping and storing the data in a way that others cannot access. The data is provided free of charge.

Citation

Please cite our work as

@InProceedings{clef-checkthat:2022:task3, author = {K{\"o}hler, Juliane and Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Wiegand, Michael and Siegel, Melanie and Mandl, Thomas}, title = "Overview of the {CLEF}-2022 {CheckThat}! Lab Task 3 on Fake News Detection", year = {2022}, booktitle = "Working Notes of CLEF 2022---Conference and Labs of the Evaluation Forum", series = {CLEF~'2022}, address = {Bologna, Italy},}

@article{shahi2021overview, title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection}, author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas}, journal={Working Notes of CLEF}, year={2021} }

Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially false, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English and German.

Task 3: Multi-class fake news detection of news articles (English). The task frames fake news detection as a four-class classification problem. Given the text of a news article, determine whether the main claim made in the article is true, partially false, false, or other. The training data will be released in batches and comprises roughly 1264 articles with their respective labels, in English. Our definitions for the categories are as follows:

False - The main claim made in an article is untrue.

Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

True - This rating indicates that the primary elements of the main claim are demonstrably true.

Other- An article that cannot be categorised as true, false, or partially false due to a lack of evidence about its claims. This category includes articles in dispute and unproven articles.

Cross-Lingual Task (German)

Along with the multi-class task for English, we have introduced a task for a low-resource language. We will provide the test data in German. The idea of the task is to use the English data and the concept of transfer to build a classification model for German.

Input Data

The data will be provided in the format of Id, title, text, rating, the domain; the description of the columns is as follows:

ID- Unique identifier of the news article

Title- Title of the news article

text- Text mentioned inside the news article

our rating - class of the news article as false, partially false, true, other

Output data format

public_id- Unique identifier of the news article

predicted_rating- predicted class

Sample File

public_id, predicted_rating
1, false
2, true

IMPORTANT!

We have used the data from 2010 to 2022, and the content of fake news is mixed up with several topics like elections, COVID-19 etc.

Baseline: For this task, we have created a baseline system. The baseline system can be found at https://zenodo.org/record/6362498

Related Work

Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf

G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14

Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104

Shahi, G. K., Struß, J. M., & Mandl, T. (2021). Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection. Working Notes of CLEF.

Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeno, A., Míguez, R., Shaar, S., ... & Mandl, T. (2021, March). The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news. In European Conference on Information Retrieval (pp. 639-649). Springer, Cham.

Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeño, A., Míguez, R., Shaar, S., ... & Kartal, Y. S. (2021, September). Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 264-291). Springer, Cham.
Ability to recognize false information and news in the U.S. 2023
statista.com
ai-chatbox.pro
Updated Apr 16, 2024
Cite
Statista (2024). Ability to recognize false information and news in the U.S. 2023 [Dataset]. https://www.statista.com/statistics/657090/fake-news-recogition-confidence/
Explore at:
Dataset updated
Apr 16, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Apr 3, 2023 - Apr 9, 2023
Area covered
United States
Description
According to a survey held in April 2023, the share of people aged 18 years and above in the United States who were very confident in their ability to distinguish real news from false information amounted to 23 percent. A further 52 percent were somewhat confident that they were able to identify misinformation, whereas just five percent had little faith in themselves to determine facts from fake content.
Fake news traffic sources in the U.S. 2017
statista.com
Updated Feb 13, 2024
Cite
Statista (2024). Fake news traffic sources in the U.S. 2017 [Dataset]. https://www.statista.com/statistics/672275/fake-news-traffic-source/
Explore at:
Dataset updated
Feb 13, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
Perhaps unsurprisingly, the main traffic source for false information online is social media, which generates 42 percent of fake news traffic. The nature of social networks, most notably the ease of sharing content, allows fake news to spread at a rapid rate – an issue further exacerbated by the fact that many U.S. adults sometimes believe fake news to be real.

Fake news: an ongoing problem

The presence of fake news would be less of an issue if users were more aware of how to identify it and were aware of the risks of sharing such content. Many U.S. news consumers have shared fake news online, and worryingly, ten percent did so deliberately. Adults who are part of that ten percent are just a small portion of people in the United States, and elsewhere in the world, who are responsible for spreading false information. More than 30 percent of U.S. children and teenagers have shared a fake news story online, and over 50 percent of adults in selected countries worldwide have wrongly believed a fake news story.

The result of adults and young consumers alike not only believing fake news, but actively sharing it, is that small, illegitimate websites producing such content are able to grow more successful. Such websites have the potential to tarnish or seriously damage the reputation of any persons mentioned within a fake news article, promote events or policies which do not exist, and mislead readers about important topics they are trying to keep up with. A 2019 survey revealed that most adults believe that fake news and misinformation will get worse in the next five years, and the sad truth is that this will likely be the case unless news consumers grow more discerning about what they post and share online.
Encountering fake news in print media worldwide 2019, by country
statista.com
Updated Jun 3, 2022
Cite
Statista (2022). Encountering fake news in print media worldwide 2019, by country [Dataset]. https://www.statista.com/statistics/1016534/fake-news-print-media-worldwide/
Explore at:
Dataset updated
Jun 3, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Dec 21, 2018 - Jan 4, 2019
Area covered
Worldwide
Description
The statistic presents the share of adults who have witnessed fake news in print media worldwide as of January 2019, broken down by country. The findings reveal that the majority of responding adults in Turkey said that they had witnessed fake news in print media, with 72 percent having encountered false information in a print publication compared to 18 percent who said they had not. Conversely, just 27 percent of respondents in Pakistan witnessed fake news in print media at some point.
Fake and True News Dataset
figshare.com
txt
Updated Dec 3, 2020
Cite
Abu Bakkar Siddik (2020). Fake and True News Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.13325198.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.13325198.v1
Dataset updated
Dec 3, 2020
Dataset provided by
Figshare (http://figshare.com/)
Authors
Abu Bakkar Siddik
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset combines two parts: fake news and true news. The fake news was collected from Kaggle, and some of the true news was collected from IEEE DataPort. More true news was needed to balance the fake news, so I collected additional true news from various trusted online sites. Finally, I concatenated the fake and true news into a single dataset to help researchers who want to study this topic further.
Data from: Real-Fake News Dataset
kaggle.com
Updated Jun 5, 2025
Cite
Akash_Sandhu4x4 (2025). Real-Fake News Dataset [Dataset]. https://www.kaggle.com/datasets/akashsandhu4x4/real-fake-news-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jun 5, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Akash_Sandhu4x4
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Akash_Sandhu4x4

Released under MIT

Contents
COVID Fake News Dataset
zenodo.org
data.niaid.nih.gov
Updated Nov 27, 2020
Cite
Sumit Banik (2020). COVID Fake News Dataset [Dataset]. http://doi.org/10.5281/zenodo.4282522
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.4282522
Dataset updated
Nov 27, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sumit Banik
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

The dataset contains a list of COVID fake news items and claims shared across the internet.

Content

Headlines: String attribute consisting of the headlines/fact shared.

Outcome: It is binary data where 0 means the headline is fake and 1 means that it is true.

Inspiration

On many research portals, a common question was whether a combined fake news dataset was available. This led to the publication of this dataset.
Perceived prevalence of fake news in media sources worldwide 2019
statista.com
Updated Aug 31, 2021
Cite
Statista (2021). Perceived prevalence of fake news in media sources worldwide 2019 [Dataset]. https://www.statista.com/statistics/1112026/fake-news-prevalence-attitudes-worldwide/
Explore at:
Dataset updated
Aug 31, 2021
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 25, 2019 - Feb 8, 2019
Area covered
Worldwide
Description
According to a global study conducted in 2019, 62 percent of respondents felt that there was a fair extent or great deal of fake news on online websites and platforms. By comparison, 10 percent less said the same about TV, radio, newspapers, and magazines. Traditional media in general is still considered more trustworthy than online formats, despite social networks being the preferred choice for many.

Meanwhile, as some consumers around the world now turn to influencers for news instead of journalists, the risk of them being exposed to inaccurate, incorrect, or deliberately false information continues to grow, and journalists face pressure to battle fake content whilst finding new ways to keep audiences engaged.

Fake news and journalism

More than 50 percent of journalists responding to a global survey believed that the public had lost trust in the media over the past year. Whilst the reasons for this are many, the role of fake news cannot be underestimated, particularly given the speed with which false content can spread and reach vulnerable or misinformed audiences. Either unintentionally or deliberately, fake news is often shared by those who encounter it, which only serves to worsen the problem. Indeed, journalists consider regular citizens to be the main source of disinformation, followed by political leaders and internet trolls.

Despite the threats fake news poses, journalists themselves feel that concerns about disinformation could positively impact the quality of journalism. There are also growing expectations from the public and journalists alike for governments and companies to do more to help boost quality journalism and curb the dissemination and influence of fake news. News industry leaders rated Google as being the best platform for supporting journalism, but the likes of Amazon and Snapchat have a long way to go before organizations consider them reliable in this respect.

Fake and Real News Dataset

kaggle.com

Updated Dec 3, 2024

Cite

Gilchrist (2024). Fake and Real News Dataset [Dataset]. https://www.kaggle.com/datasets/gilchr/fake-and-real-news-dataset

Explore at:

Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.

Dataset updated

Dec 3, 2024

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Gilchrist

License

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically

Description

Title: Fake vs Real News Dataset

Description:

This dataset contains news articles classified into two categories: real and fake. It is designed to help researchers, data scientists, and students build and test machine learning models capable of detecting fake news.

Dataset Structure:

Columns:
- title: The title of the news article.
- content: The full content of the news article (raw text).
- target: A binary label indicating the authenticity of the news:
  - 0: Real news.
  - 1: Fake news.

Objective:

The primary goals of this dataset are to:
- Provide a resource for training and evaluating binary classification models.
- Enable experiments on Natural Language Processing (NLP), such as text vectorization, sentiment analysis, and more.
- Encourage exploration of approaches to identify biases in data related to fake news detection.

Data Sources:

This dataset was created by merging two existing CSV files, representing fake and real news articles respectively. https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset?select=Fake.csv

Sample Data:

title	content	target
NASA announces new Mars rover mission	NASA revealed plans for a new mission to Mars starting in 2025.	0
Vaccines implant 5G chips	Conspiracy theorists claim vaccines are used to implant 5G tracking.	1
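
A minimal sketch of reading rows in the layout shown above and decoding the target column. It assumes tab-separated text, which matches the sample as rendered here; the published file may use a different delimiter. The names read_articles and TARGET_NAMES are hypothetical.

```python
import csv
import io

# The sample rows from the description, as tab-separated text.
SAMPLE = (
    "title\tcontent\ttarget\n"
    "NASA announces new Mars rover mission\t"
    "NASA revealed plans for a new mission to Mars starting in 2025.\t0\n"
    "Vaccines implant 5G chips\t"
    "Conspiracy theorists claim vaccines are used to implant 5G tracking.\t1\n"
)

# Label convention stated in the dataset description: 0 is real, 1 is fake.
TARGET_NAMES = {0: "real", 1: "fake"}


def read_articles(text, delimiter="\t"):
    """Parse delimited text into dicts with a human-readable label (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {
            "title": row["title"],
            "content": row["content"],
            "label": TARGET_NAMES[int(row["target"])],
        }
        for row in reader
    ]


articles = read_articles(SAMPLE)
for article in articles:
    print(article["label"], "|", article["title"])
```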

Potential Use Cases:

Train classification models to predict the authenticity of news articles.
Test NLP pipelines, such as those based on CountVectorizer, TF-IDF, or advanced models like BERT.
Study trends in fake news: topics, keywords, and linguistic patterns.
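
As a rough illustration of the TF-IDF step mentioned above, the following sketch weights whitespace-separated tokens by raw term frequency times log(N / document frequency). This is a simplified textbook variant, not the smoothed formula that library vectorizers typically apply, and the function name tfidf is hypothetical. The two documents are the content values from the sample data.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {token: weight} dict per document (simplified, unsmoothed TF-IDF)."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)

    # Document frequency: in how many documents each token occurs.
    doc_freq = Counter()
    for tokens in tokenised:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights.append(
            {tok: n * math.log(n_docs / doc_freq[tok]) for tok, n in counts.items()}
        )
    return weights


DOCS = [
    "NASA revealed plans for a new mission to Mars starting in 2025.",
    "Conspiracy theorists claim vaccines are used to implant 5G tracking.",
]

weights = tfidf(DOCS)
# Tokens found in every document (here "to") get weight 0; tokens unique to
# one document get log(2) per occurrence.
print(weights[0]["to"], weights[0]["nasa"])
```

The whitespace tokeniser keeps punctuation attached to words, so a real pipeline would normally add proper tokenisation before weighting.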

Caution:

This dataset is provided for educational and research purposes only.
Model results should be interpreted carefully and not used for critical applications without thorough validation.

News Datasets
brightdata.com
.json, .csv, .xlsx
Cite
Bright Data, News Datasets [Dataset]. https://brightdata.com/products/datasets/news
Available download formats: .json, .csv, .xlsx
Dataset authored and provided by
Bright Data
License
https://brightdata.com/license
Area covered
Worldwide
Description
Stay ahead with our comprehensive News Dataset, designed for businesses, analysts, and researchers to track global events, monitor media trends, and extract valuable insights from news sources worldwide.

Dataset Features

News Articles: Access structured news data, including headlines, summaries, full articles, publication dates, and source details. Ideal for media monitoring and sentiment analysis.
Publisher & Source Information: Extract details about news publishers, including domain, region, and credibility indicators.
Sentiment & Topic Classification: Analyze news sentiment, categorize articles by topic, and track emerging trends in real time.
Historical & Real-Time Data: Retrieve historical archives or access continuously updated news feeds for up-to-date insights.

Customizable Subsets for Specific Needs

Our News Dataset is fully customizable, allowing you to filter data based on publication date, region, topic, sentiment, or specific news sources. Whether you need broad coverage for trend analysis or focused data for competitive intelligence, we tailor the dataset to your needs.

Popular Use Cases

Media Monitoring & Reputation Management: Track brand mentions, analyze media coverage, and assess public sentiment.
Market & Competitive Intelligence: Monitor industry trends, competitor activity, and emerging market opportunities.
AI & Machine Learning Training: Use structured news data to train AI models for sentiment analysis, topic classification, and predictive analytics.
Financial & Investment Research: Analyze news impact on stock markets, commodities, and economic indicators.
Policy & Risk Analysis: Track regulatory changes, geopolitical events, and crisis developments in real time.

Whether you're analyzing market trends, monitoring brand reputation, or training AI models, our News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Frequency of encountering potentially fake news online India 2023
statista.com
Updated Jun 26, 2024
Cite
Statista (2024). Frequency of encountering potentially fake news online India 2023 [Dataset]. https://www.statista.com/statistics/1406289/india-frequency-of-seeing-fake-news-online/
Explore at:
Dataset updated
Jun 26, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Mar 2023
Area covered
India
Description
According to a digital news consumption survey conducted in India in March 2023, more than 60 percent of the respondents claimed that they sometimes encountered potentially fake news online. In contrast, three percent of the surveyed consumers stated that they never encountered potentially fake news online. In recent years, the number of fake news-related incidents in India has been on the rise.
Indian Fake News
kaggle.com
Updated Aug 1, 2022
Cite
Bikram Saha (2022). Indian Fake News [Dataset]. https://www.kaggle.com/datasets/imbikramsaha/fake-real-news
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 1, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Bikram Saha
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
India
Description
news_dataset.csv is a fake news classification dataset.

It contains two columns:

text: the news text
label: FAKE or REAL

Use 20% of the data as the test set and the remaining 80% for training.
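
A minimal sketch of the 80/20 split described above, using only the standard library. The description does not say whether rows should be shuffled or which rows belong to the test set, so the seeded shuffle, the helper name split_rows, and the placeholder rows are all assumptions.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test) lists (hypothetical helper)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder (text, label) rows standing in for the contents of news_dataset.csv.
rows = [("placeholder text %d" % i, "FAKE" if i % 2 else "REAL") for i in range(10)]

train, test = split_rows(rows)
print(len(train), len(test))
```

A stratified split, which keeps the FAKE/REAL ratio equal in both parts, may be preferable when the classes are imbalanced.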
Fake News Database
data.niaid.nih.gov
explore.openaire.eu
Updated Mar 22, 2024
Cite
Damião, Íris (2024). Fake News Database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10354244
Explore at:
Dataset updated
Mar 22, 2024
Dataset provided by
Reis, Jose
Gonçalves-Sá, Joana
Damião, Íris
Davidson, Alex
Rijo, Angela
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Curated database of fact-checked claims (fake and real news), with close to 70,000 URLs, classified by topic.
Fox News dataset is for analyzing media trends and narratives
crawlfeeds.com
csv, zip
Updated May 19, 2025
Cite
Crawl Feeds (2025). Fox News dataset is for analyzing media trends and narratives [Dataset]. https://crawlfeeds.com/datasets/fox-news-dataset
Available download formats: zip, csv
Dataset updated
May 19, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
The Fox News Dataset is a comprehensive collection of over 1 million news articles, offering an unparalleled resource for analyzing media narratives, public discourse, and political trends. Covering articles up to the year 2023, this dataset is a treasure trove for researchers, analysts, and businesses interested in gaining deeper insights into the topics and trends covered by Fox News.

Key Features of the Fox News Dataset

Extensive Coverage: Contains more than 1 million articles spanning various topics and events up to 2023.

Research-Ready: Perfect for text classification, natural language processing (NLP), and other research purposes.

Format: Provided in CSV format for seamless integration into analytical and research tools.

Why Use This Dataset?

This large dataset is ideal for:

Text Classification: Develop machine learning models to classify and categorize news content.

Natural Language Processing (NLP): Conduct sentiment analysis, keyword extraction, or topic modeling.

Media and Political Research: Analyze media narratives, public opinion, and political trends reflected in Fox News articles.

Trend Analysis: Identify shifts in public discourse and media focus over time.

Explore More News Datasets

Discover additional resources for your research needs by visiting our news dataset collection. These datasets are tailored to support diverse analytical applications, including sentiment analysis and trend modeling.

The Fox News Dataset is a must-have for anyone interested in exploring large-scale media data and leveraging it for advanced analysis. Ready to dive into this wealth of information? Download the dataset now in CSV format and start uncovering the stories behind the headlines.
Average Number of Fake News Stories Shared on Facebook, by Age Group
evidencehub.net
json
Updated Feb 11, 2022
Guess, Andrew, Jonathan Nagler, Joshua Tucker. Less Than You Think: Prevalence and Predictors of Fake News Dissemination on Facebook (New York: American Association for the Advancement of Science, 2019) (2022). Average Number of Fake News Stories Shared on Facebook, by Age Group [Dataset]. https://evidencehub.net/chart/average-number-of-fake-news-stories-shared-on-facebook-by-age-group-74.0
Available download formats: json
Dataset updated
Feb 11, 2022
Dataset provided by
The Lisbon Council
Authors
Guess, Andrew, Jonathan Nagler, Joshua Tucker. Less Than You Think: Prevalence and Predictors of Fake News Dissemination on Facebook (New York: American Association for the Advancement of Science, 2019)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Measurement technique
Survey (N=5000)
Description
The chart shows that Americans over 65 were more likely to share fake news with their Facebook friends, regardless of education, ideology, and partisanship. The oldest age group shared nearly seven times as many articles from fake news domains on Facebook as the youngest age group, and about 2.3 times as many as the next-oldest age group. The source does not display data for the 18-29 and 30-44 age groups, so the corresponding values in this chart are approximate, estimated by pixel count.
MM-COVID Dataset
paperswithcode.com
Updated May 23, 2025
Yichuan Li; Bohan Jiang; Kai Shu; Huan Liu (2025). MM-COVID Dataset [Dataset]. https://paperswithcode.com/dataset/mm-covid
Dataset updated
May 23, 2025
Authors
Yichuan Li; Bohan Jiang; Kai Shu; Huan Liu
Description
MM-COVID is a dataset for detecting COVID-19-related fake news. It provides multilingual fake news together with the relevant social context. It contains 3,981 pieces of fake news content and 7,192 pieces of trustworthy information in six languages: English, Spanish, Portuguese, Hindi, French, and Italian.
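The two counts in the description imply an uneven class split, which matters when training a detector on this data. The following is a minimal sketch, not part of the dataset's own tooling, that derives the class shares and one common inverse-frequency weighting from those counts. The function names are hypothetical, and the weighting scheme is an assumption about how one might compensate for the imbalance.

```python
# Counts taken from the MM-COVID description; everything else is illustrative.
FAKE_COUNT = 3_981
REAL_COUNT = 7_192


def class_shares(fake: int, real: int) -> dict:
    """Return each class's fraction of the combined total."""
    total = fake + real
    return {"fake": fake / total, "real": real / total}


def inverse_frequency_weights(fake: int, real: int) -> dict:
    """Weight each class by total / (number_of_classes * class_count).

    The rarer class receives the larger weight, so both classes
    contribute equally to a weighted loss.
    """
    total = fake + real
    return {"fake": total / (2 * fake), "real": total / (2 * real)}


shares = class_shares(FAKE_COUNT, REAL_COUNT)
weights = inverse_frequency_weights(FAKE_COUNT, REAL_COUNT)

print(f"fake share: {shares['fake']:.3f}, real share: {shares['real']:.3f}")
print(f"fake weight: {weights['fake']:.3f}, real weight: {weights['real']:.3f}")
```

The description gives only the totals across all six languages, so the per-language balance may differ from these overall figures.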
BBC News Dataset – February 2023 Edition
crawlfeeds.com
csv, zip
Updated Jun 14, 2025
Crawl Feeds (2025). BBC News Dataset – February 2023 Edition [Dataset]. https://crawlfeeds.com/datasets/bbc-news-dataset-feb-2023
Available download formats: zip, csv
Dataset updated
Jun 14, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Get access to a comprehensive and structured dataset of BBC News articles, freshly crawled and compiled in February 2023. This collection includes 1 million records from one of the world’s most trusted news organizations — perfect for training NLP models, sentiment analysis, and trend detection across global topics.

💾 Format: CSV (available in ZIP archive)

📢 Status: Published and available for immediate access

Use Cases

Train language models to summarize or categorize news

Detect media bias and compare narrative framing

Conduct research in journalism, politics, and public sentiment

Enrich news aggregation platforms with clean metadata

Analyze content distribution across categories (e.g. health, politics, tech)

This dataset ensures reliable and high-quality information sourced from a globally respected outlet. The format is optimized for quick ingestion into your pipelines — with clean text, timestamps, image links, and more.

Need a filtered dataset or want this refreshed for a later date? We offer on-demand news scraping as well.


