4 datasets found
  1. Data from: ClaimsKG - A Knowledge Graph of Fact-Checked Claims

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    zip
    Updated Oct 18, 2022
    Andon Tchechmedjiev; Pavlos Fafalios; Konstantin Todorov; Stefan Dietze; Boland; Zapilko (2022). ClaimsKG - A Knowledge Graph of Fact-Checked Claims [Dataset]. http://doi.org/10.5281/zenodo.3518960
    Available download formats: zip
    Dataset updated: Oct 18, 2022
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Andon Tchechmedjiev; Pavlos Fafalios; Konstantin Todorov; Stefan Dietze; Boland; Zapilko
    License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The latest release of ClaimsKG is available in Datorium.

    ClaimsKG is a knowledge graph of metadata information for thousands of fact-checked claims which facilitates structured queries about their truth values, authors, dates, and other kinds of metadata. ClaimsKG is generated through a (semi-)automated pipeline, which harvests claim-related data from popular fact-checking web sites, annotates them with related entities from DBpedia, and lifts all data to RDF using an RDF/S model that makes use of established vocabularies (such as schema.org).

    ClaimsKG does NOT contain the text of the reviews from the fact-checking web sites; it only contains structured metadata information and links to the reviews.

    More information, such as statistics, query examples, and a user-friendly interface to explore the knowledge graph, is available at: https://data.gesis.org/claimskg/site

    If you use ClaimsKG, please cite the paper below:

    Tchechmedjiev, Andon, Pavlos Fafalios, Katarina Boland, Malo Gasquet, Matthäus Zloch, Benjamin Zapilko, Stefan Dietze, and Konstantin Todorov. "ClaimsKG: a Knowledge Graph of Fact-Checked Claims." In International Semantic Web Conference, pp. 309-324. Springer, Cham, 2019. https://doi.org/10.1007/978-3-030-30796-7_20

  2. ClaimsKG - A Knowledge Graph of Fact-Checked Claims (January, 2023)

    • search.gesis.org
    • pollux-fid.de
    Updated Sep 27, 2023
    + more versions
    Gangopadhyay, Susmita; Schellhammer, Sebastian; Boland, Katarina; Schüller, Sascha; Todorov, Konstantin; Tchechmedjiev, Andon; Zapilko, Benjamin; Fafalios, Pavlos; Jabeen, Hajira; Dietze, Stefan (2023). ClaimsKG - A Knowledge Graph of Fact-Checked Claims (January, 2023) [Dataset]. http://doi.org/10.7802/2620
    Dataset updated: Sep 27, 2023
    Dataset provided by: GESIS, Köln; GESIS search
    Authors: Gangopadhyay, Susmita; Schellhammer, Sebastian; Boland, Katarina; Schüller, Sascha; Todorov, Konstantin; Tchechmedjiev, Andon; Zapilko, Benjamin; Fafalios, Pavlos; Jabeen, Hajira; Dietze, Stefan
    License: https://www.gesis.org/en/institute/data-usage-terms

    Description

    ClaimsKG is a knowledge graph of metadata information for fact-checked claims scraped from popular fact-checking sites. In addition to providing a single dataset of claims and associated metadata, it harmonizes truth ratings and provides additional information for each claim, e.g., about mentioned entities. See https://data.gesis.org/claimskg/ for further details about the data model, query examples, and statistics.

    The dataset facilitates structured queries about claims, their truth values, involved entities, authors, dates, and other kinds of metadata. ClaimsKG is generated through a (semi-)automated pipeline, which harvests claim-related data from popular fact-checking web sites, annotates them with related entities from DBpedia/Wikipedia, and lifts all data to RDF using established vocabularies (such as schema.org).

    The latest release of ClaimsKG covers 74066 claims and 72127 Claim Reviews. This is the fourth release of the dataset; data was scraped up to Jan 31, 2023, and it contains claims published between 1996 and 2023 from 13 fact-checking websites. The websites are Fullfact, Politifact, TruthOrFiction, Checkyourfact, Vishvanews, AFP (French), AFP, Polygraph, EU factcheck, Factograph, Fatabyyano, Snopes and Africacheck. The claim-review (fact-checking) period likewise ranges from 1996 to 2023. As in the previous release, the entity-fishing Python client (https://github.com/hirmeos/entity-fishing-client-python) was used for entity linking and disambiguation. The web scraping and data preprocessing pipeline has been improved to extract more entities from both claims and claim reviews. Currently, ClaimsKG contains 3408386 entities detected and referenced with DBpedia.

    This latest release of ClaimsKG supersedes the previous versions: it contains all claims from the previous versions plus new claims, and its improved entity annotation yields a higher number of entities.
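
As a rough illustration of the kind of structured query the description refers to, the sketch below builds a SPARQL query string in Python. It is not taken from the ClaimsKG documentation: the function name `build_claims_query` is hypothetical, and the graph pattern only assumes that reviews are typed as schema.org `ClaimReview` with `itemReviewed`, `reviewRating`, `url`, `text`, `alternateName` and `datePublished` properties. The actual ClaimsKG data model may use different property paths or language-tagged literals, so check the project site before relying on it.

```python
def build_claims_query(rating_label: str, limit: int = 10) -> str:
    """Return a SPARQL query string for claim reviews with a given rating label.

    The property paths are assumptions based on schema.org, not a confirmed
    description of the ClaimsKG model.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    # Escape backslashes and double quotes so the label stays a valid
    # SPARQL string literal.
    safe_label = rating_label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX schema: <http://schema.org/>
SELECT ?claimText ?reviewUrl ?date WHERE {{
  ?review a schema:ClaimReview ;
          schema:itemReviewed ?claim ;
          schema:reviewRating ?rating ;
          schema:url ?reviewUrl .
  ?claim schema:text ?claimText .
  ?rating schema:alternateName "{safe_label}" .
  OPTIONAL {{ ?claim schema:datePublished ?date }}
}}
LIMIT {limit}"""


if __name__ == "__main__":
    print(build_claims_query("FALSE", limit=5))
```

The resulting string can be pasted into any SPARQL client pointed at a ClaimsKG endpoint or a local RDF store loaded with the dump.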

  3. ARAFA: An LLM Generated Arabic Fact Checking Dataset

    • zenodo.org
    json
    Updated Aug 12, 2025
    + more versions
    Christophe Khalil (2025). ARAFA: An LLM Generated Arabic Fact Checking Dataset [Dataset]. http://doi.org/10.5281/zenodo.16762969
    Available download formats: json
    Dataset updated: Aug 12, 2025
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Christophe Khalil
    Description

    Automatic fact-checking poses a significant challenge in Arabic natural language processing due to the scarcity of datasets and resources. In this manuscript, we introduce ARAFA, a new large-scale dataset for fact-checking in Modern Standard Arabic, constructed through an automated framework leveraging large language models (LLMs). The dataset was constructed through a three-step pipeline: (1) claim generation from Arabic Wikipedia pages with supporting textual evidence, (2) claim mutation to generate challenging counterfactual claims with refuting evidence, and (3) an automatic validation step that checks whether each generated claim is supported or refuted by its accompanying evidence, or whether the evidence does not provide enough information to judge the claim. The resulting dataset comprises 181,976 claim-evidence pairs labeled as supported, refuted, or not enough information. Human evaluation carried out on a test sample from the dataset demonstrated strong inter-annotator agreement measured with Cohen's kappa (κ = 0.89 for supported claims and κ = 0.94 for refuted claims). Automatic validation, measured against the human-evaluated sample, achieved 86% accuracy for supported claims and 88% for refuted ones. To showcase ARAFA's value as a resource for automatic Arabic fact-checking, four open-source transformer-based models were fine-tuned using ARAFA, with the top-performing model achieving a Macro F1-score of 77% on the test data. In addition to ARAFA being the first large-scale dataset for Arabic fact-checking, our framework presents a scalable approach for developing similar resources for other low-resource languages.
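
For readers unfamiliar with the agreement statistic quoted above, here is a minimal sketch of how Cohen's kappa is computed for two annotators. It is not the authors' evaluation code: the function name `cohens_kappa` and the toy labels are illustrative assumptions, and the example does not reproduce the reported values.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: probability that both pick the same label independently,
    # estimated from each rater's own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy labels (hypothetical): the annotators disagree on the last item only.
    a = ["supported", "supported", "refuted", "refuted"]
    b = ["supported", "supported", "refuted", "supported"]
    print(cohens_kappa(a, b))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than the plain percentage of matching labels.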

  4. qa2d-cs

    • huggingface.co
    Updated Jan 15, 2007
    + more versions
    AI Center FEE CTU (2007). qa2d-cs [Dataset]. https://huggingface.co/datasets/ctu-aic/qa2d-cs
    Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated: Jan 15, 2007
    Dataset authored and provided by: AI Center FEE CTU
    License: MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Czech version of the Question to Declarative Sentence (QA2D) dataset, machine-translated using the DeepL service. For more information, see our paper "Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language", currently in review for the NCAA journal. @article{drchal2023pipeline, title={Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language}, author={Drchal, Jan and Ullrich, Herbert and Mlyn{\'a}{\v{r}}, Tom{\'a}{\v{s}} and Moravec, V{\'a}clav}… See the full description on the dataset page: https://huggingface.co/datasets/ctu-aic/qa2d-cs.

