MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "x-fact"
Dataset Description
Dataset Summary
X-FACT is a multilingual dataset for fact-checking with real-world claims. The dataset contains short statements in 25 languages, each with the top five evidence documents retrieved by running a Google search on the claim statement. The dataset contains two additional evaluation splits (in addition to a traditional test set): ood and zeroshot. ood measures out-of-domain generalization where while the language… See the full description on the dataset page: https://huggingface.co/datasets/utahnlp/x-fact.
The results of a survey held in summer 2022 found that the main way to verify news was using Google or another search engine. Fact-checking by reading further on the topic or finding information from experts were also popular strategies, especially for news found on social media.
According to the most recently available data, around ********* of Americans feel very confident in their ability to check the accuracy of news stories regarding coronavirus. In an online survey conducted in **********, ** percent of respondents stated they would know how to confirm the accuracy of news and information regarding the COVID-19 pandemic. The majority of participants expressed a moderate level of self-confidence in their capacity to fact-check, with ** percent somewhat confident.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Fin-Fact - Financial Fact-Checking Dataset
Overview
Welcome to the Fin-Fact repository! Fin-Fact is a comprehensive dataset designed specifically for financial fact-checking and explanation generation. This README provides an overview of the dataset, how to use it, and other relevant information.
Dataset Usage
Fin-Fact is a valuable resource for researchers, data scientists, and fact-checkers in the financial domain. Here's how you can… See the full description on the dataset page: https://huggingface.co/datasets/amanrangapur/Fin-Fact.
We conduct a content analysis of fact-check articles published by six major fact-checkers from the UK and US. Our analysis builds on existing content analyses of fact-checking content and empirical studies of the epistemic practices of fact-checkers by focusing on the claim types checked, issue identified, arguments advanced, and verdicts reached by the fact-checkers in our corpus. We find that the fact-checkers in our corpus predominantly check claim types and content that can in theory be verified, but that they occasionally check claim types that cannot be factually verified. We also find a great diversity in the issues identified with claims and the arguments advanced to substantiate assessments. Some of these issues and arguments are consistent with a ‘verification model’ of fact-checking, whereas others are more consistent with distinct epistemic approaches to fact-checking that we term ‘argumentative’ and ‘interpretivist’. Decisive false verdicts are the most common verdict reached in our corpus, and they are regularly reached for some claim types that are not factually verifiable. Our study contributes to debates about the epistemology of fact-checking by producing evidence on the extent to which fact-checkers check non-verifiable claims and on the varied types of epistemic work undertaken by fact-checkers.
A survey from July 2022 asked Americans how they felt about the effects of bias in news on their ability to sort out facts, and revealed that 50 percent felt there was so much bias in the news that it was difficult to discern what was factual from information that was not. This was the highest share who said so across all years shown, and at the same time, the 2022 survey showed the lowest share of respondents who believed there were enough sources to be able to sort out fact from fiction.
The US Marshals Service web site.
persuasion-scaling-laws/fact-checking dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for FactCheck
📝 Dataset Summary
FactCheck is a benchmark for evaluating LLMs on knowledge graph fact verification. It combines structured facts from YAGO, DBpedia, and FactBench with web-extracted evidence including questions, summaries, full text, and metadata. The dataset contains examples designed for sentence-level fact-checking and QA tasks.
📚 Supported Tasks
Question Answering: Answer fact-checking questions derived from KG triples.… See the full description on the dataset page: https://huggingface.co/datasets/FactCheck-AI/FactCheck.
Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
GammaCorpus: Fact QA 450k
What is it?
GammaCorpus Fact QA 450k is a dataset that consists of 450,000 fact-based question-and-answer pairs designed for training AI models on factual knowledge retrieval and question-answering tasks.
Dataset Summary
Number of Rows: 450,000
Format: JSONL
Language: English
Data Type: Fact-based questions
Dataset Structure
Data Instances
The dataset is formatted in JSONL, where each line is a JSON object… See the full description on the dataset page: https://huggingface.co/datasets/rubenroy/GammaCorpus-Fact-QA-450k.
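Since the listing truncates the description before showing a record, the following is only a minimal sketch of reading JSONL in general, not the dataset's documented schema. The field names `question` and `answer` and the sample rows are hypothetical, chosen purely for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        yield record


# Hypothetical sample rows: the real field names are not shown in the listing.
SAMPLE = io.StringIO(
    '{"question": "What is the capital of France?", "answer": "Paris"}\n'
    "\n"
    '{"question": "How many days are in a leap year?", "answer": "366"}\n'
)

records = list(iter_jsonl(SAMPLE))
print(len(records))
```

Reading line by line keeps memory use flat, which matters for a file with hundreds of thousands of rows; to read a file on disk, pass an open file handle in place of the in-memory sample.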
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
State fact sheets provide information on population, income, education, employment, federal funds, organic agriculture, farm characteristics, farm financial indicators, top commodities, and exports for each State in the United States. Links to county-level data are included when available. This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resources: Query tool. For complete information, please visit https://data.gov.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Does external monitoring improve democratic performance? Fact-checking has come to play an increasingly important role in political coverage in the United States, but some research suggests it may be ineffective at reducing public misperceptions about controversial issues. However, fact-checking might instead help improve political discourse by increasing the reputational costs or risks of spreading misinformation for political elites. To evaluate this deterrent hypothesis, we conducted a field experiment on a diverse group of state legislators from nine U.S. states in the months before the November 2012 election. In the experiment, a randomly assigned subset of state legislators were sent a series of letters about the risks to their reputation and electoral security if they are caught making questionable statements. The legislators who were sent these letters were substantially less likely to receive a negative fact-checking rating or to have their accuracy questioned publicly, suggesting that fact-checking can reduce inaccuracy when it poses a salient threat.
Population and other demographic information is collected by the US Census Bureau.
View the US Census Bureau's Quick Facts page about Bloomington, Indiana at https://www.census.gov/quickfacts
The Demographic Profile and other data for Bloomington can be viewed or downloaded from the American FactFinder search tool: https://factfinder.census.gov/bkmk/cf/1.0/en/place/Bloomington city, Indiana/POPULATION/DECENNIAL_CNT
The Census Bureau is creating a new platform for data. This site is in a preview stage and some parts are under construction. Here is a link for Bloomington: https://data.census.gov/cedsci/results/all?q=Bloomington%20city,%20Indiana&g=1600000US1805860&ps=app*from@SINGLE_SEARCH
The City webpage for Census data contains other related information: https://bloomington.in.gov/about/census-data
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data in this map service is updated every weekend. Note: This data includes all activities regardless of whether there is a spatial feature attached. Note: This is a large dataset. Metadata and Downloads are available at: https://data.fs.usda.gov/geodata/edw/datasets.php?xmlKeyword=FACTS+common+attributes To download FACTS activities layers, search for the activity types you want, such as timber harvest or hazardous fuels treatments. The Forest Service's Natural Resource Manager (NRM) Forest Activity Tracking System (FACTS) is the agency standard for managing information about activities related to fire/fuels, silviculture, and invasive species. This feature class contains the FACTS attributes most commonly needed to describe FACTS activities. This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resources: ISO-19139 metadata, ArcGIS Hub Dataset, ArcGIS GeoService, CSV, Shapefile, GeoJSON, KML, https://apps.fs.usda.gov/arcx/rest/services/EDW/EDW_ActivityFactsCommonAttributes_01/MapServer/0, Geodatabase Download, Shapefile Download. For complete information, please visit https://data.gov.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
U.S. Census Bureau QuickFacts statistics for Louisiana. QuickFacts data are derived from: Population Estimates, American Community Survey, Census of Population and Housing, Current Population Survey, Small Area Health Insurance Estimates, Small Area Income and Poverty Estimates, State and County Housing Unit Estimates, County Business Patterns, Nonemployer Statistics, Economic Census, Survey of Business Owners, Building Permits.
This EnviroAtlas dataset estimates population by 12-digit HUC. It is based on the EnviroAtlas dasymetric dataset, which intelligently reallocates 2010 population from census blocks to 30 meter pixels based on land cover and slope. The dasymetric data was aggregated by HUC_12 boundary to summarize population by watershed. This dataset was produced by the US EPA to support research and online mapping activities related to EnviroAtlas. EnviroAtlas (https://www.epa.gov/enviroatlas) allows the user to interact with a web-based, easy-to-use, mapping application to view and analyze multiple ecosystem services for the contiguous United States. The dataset is available as downloadable data (https://edg.epa.gov/data/Public/ORD/EnviroAtlas) or as an EnviroAtlas map service. Additional descriptive information about each attribute in this dataset can be found in its associated EnviroAtlas Fact Sheet (https://www.epa.gov/enviroatlas/enviroatlas-fact-sheets).
This EnviroAtlas dataset intelligently reallocates 2010 population from census blocks to 30 meter pixels based on land cover and slope. This dataset was produced by the US EPA to support research and online mapping activities related to EnviroAtlas. EnviroAtlas (https://www.epa.gov/enviroatlas) allows the user to interact with a web-based, easy-to-use, mapping application to view and analyze multiple ecosystem services for the contiguous United States. The dataset is available as downloadable data (https://edg.epa.gov/data/Public/ORD/EnviroAtlas) or as an EnviroAtlas map service. Additional descriptive information about each attribute in this dataset can be found in its associated EnviroAtlas Fact Sheet (https://www.epa.gov/enviroatlas/enviroatlas-fact-sheets).
The Faith Communities Today 2010 national survey (https://faithcommunitiestoday.org/research-projects-findings/fact-2010/) brings together 26 individual surveys of congregations. Twenty-four were conducted by or for partner denominations and faith groups, representing 32 of the country's largest denominations and traditions. The common core questionnaire of the survey replicates more than 150 questions from the 2000, 2005 and 2008 surveys, plus a special section on the 2008 recession. This dataset contains the FACT 2010 data from the Presbyterian Church (USA).
Only three percent of the nearly 92,000 existing dams in the United States had hydroelectric facilities in 2024. Most of those dams were regulated by the State, with only around five percent regulated by the federal government. Texas is home to the greatest number of dams in the U.S.