25 datasets found

SimpleQA
huggingface.co
Cite
basicv8vc, SimpleQA [Dataset]. https://huggingface.co/datasets/basicv8vc/SimpleQA
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Authors
basicv8vc
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
SimpleQA

SimpleQA is a factuality benchmark that measures the ability of language models to answer short, fact-seeking questions.

Sources

openai/simple-evals
Introducing SimpleQA
Measuring short-form factuality in large language models
Share of questions answered by AI models in SimpleQA benchmark 2025
statista.com
Updated May 30, 2025
Cite
Statista (2025). Share of questions answered by AI models in SimpleQA benchmark 2025 [Dataset]. https://www.statista.com/statistics/1612496/ai-simpleqa-share-of-questions-answered/
Dataset updated
May 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2024
Area covered
Worldwide
Description
In the 2025 SimpleQA benchmark, OpenAI's o1 had the highest share of questions answered when attempted. Claude-3 had the highest share of questions not attempted; whether this is due to a lack of data or other reasons is unknown.
simpleQA
huggingface.co
Cite
Oid Labs, simpleQA [Dataset]. https://huggingface.co/datasets/oidlabs/simpleQA
Dataset authored and provided by
Oid Labs
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
oidlabs/simpleQA dataset hosted on Hugging Face and contributed by the HF Datasets community
simpleqa
huggingface.co
Cite
China Merchants Research Institute Of Advanced Technology, simpleqa [Dataset]. https://huggingface.co/datasets/cmriat/simpleqa
Dataset authored and provided by
China Merchants Research Institute Of Advanced Technology
Description
cmriat/simpleqa dataset hosted on Hugging Face and contributed by the HF Datasets community
simpleqa
huggingface.co
Cite
Llama Stack, simpleqa [Dataset]. https://huggingface.co/datasets/llamastack/simpleqa
Explore at:
Croissant
Dataset authored and provided by
Llama Stack
Description
llamastack/simpleqa dataset hosted on Hugging Face and contributed by the HF Datasets community
SimpleQuestions Dataset
paperswithcode.com
Updated Aug 14, 2021
Cite
Antoine Bordes; Nicolas Usunier; Sumit Chopra; Jason Weston (2021). SimpleQuestions Dataset [Dataset]. https://paperswithcode.com/dataset/simplequestions
Dataset updated
Aug 14, 2021
Authors
Antoine Bordes; Nicolas Usunier; Sumit Chopra; Jason Weston
Description
SimpleQuestions is a large-scale factoid question answering dataset. It consists of 108,442 natural language questions, each paired with a corresponding fact from the Freebase knowledge base. Each fact is a triple (subject, relation, object), and the answer to the question is always the object. The dataset is divided into training, validation, and test sets with 75,910, 10,845, and 21,687 questions respectively.
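A minimal sketch of the structure described above, under stated assumptions: the `Fact` and `SimpleQuestion` classes, their field names, and the sample question are hypothetical illustrations, not the dataset's actual file format or contents. It shows a question paired with a (subject, relation, object) triple, with the answer read from the object, and checks that the quoted split sizes add up to the stated total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    # Hypothetical field names for the (subject, relation, object) triple.
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class SimpleQuestion:
    question: str
    fact: Fact

    @property
    def answer(self) -> str:
        # Per the description, the answer is always the object of the fact.
        return self.fact.obj


# Made-up example for illustration only; not taken from the dataset.
example = SimpleQuestion(
    question="Which country is Paris the capital of?",
    fact=Fact(subject="Paris", relation="capital_of", obj="France"),
)

# Split sizes as quoted in the description.
SPLIT_SIZES = {"train": 75_910, "validation": 10_845, "test": 21_687}
TOTAL = sum(SPLIT_SIZES.values())
```

Here `example.answer` evaluates to `"France"`, and `TOTAL` equals 108,442, matching the question count given in the description.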
SimpleQA LLM Evaluation Benchmark Leaderboard
datalearner.com
Updated Oct 15, 2024
Cite
数据学习 (DataLearner) (2024). SimpleQA LLM Evaluation Benchmark Leaderboard [Dataset]. https://www.datalearner.com/ai-models/llm-benchmark-tests/33
Dataset updated
Oct 15, 2024
Dataset authored and provided by
数据学习 (DataLearner)
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
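The latest large language model (LLM) performance leaderboard based on the SimpleQA benchmark, including each model's score, publishing organization, release date, and other data.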
SimpleQA-RLVR-noprompt
huggingface.co
+ more versions
Cite
Hamish Ivison, SimpleQA-RLVR-noprompt [Dataset]. https://huggingface.co/datasets/hamishivi/SimpleQA-RLVR-noprompt
Authors
Hamish Ivison
Description
hamishivi/SimpleQA-RLVR-noprompt dataset hosted on Hugging Face and contributed by the HF Datasets community
SimpleQA-1000
huggingface.co
Updated Mar 23, 2025
Cite
Andrey Galichin (2025). SimpleQA-1000 [Dataset]. https://huggingface.co/datasets/andreuka18/SimpleQA-1000
Dataset updated
Mar 23, 2025
Authors
Andrey Galichin
Description
andreuka18/SimpleQA-1000 dataset hosted on Hugging Face and contributed by the HF Datasets community
output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128...
huggingface.co
Updated Dec 25, 2024
+ more versions
Cite
Sen Yang (2024). output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128 [Dataset]. https://huggingface.co/datasets/ringos/output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128
Explore at:
Croissant
Dataset updated
Dec 25, 2024
Authors
Sen Yang
Description
ringos/output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128 dataset hosted on Hugging Face and contributed by the HF Datasets community
simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions
huggingface.co
Cite
Noah Shen, simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions [Dataset]. https://huggingface.co/datasets/NoahShen/simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions
Authors
Noah Shen
Description
NoahShen/simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions dataset hosted on Hugging Face and contributed by the HF Datasets community
simple_questions_v2
huggingface.co
Updated May 24, 2024
+ more versions
Cite
Fethi BOUGARES (2024). simple_questions_v2 [Dataset]. https://huggingface.co/datasets/fbougares/simple_questions_v2
Dataset updated
May 24, 2024
Authors
Fethi BOUGARES
License
Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
License information was derived automatically
Description
SimpleQuestions is a dataset for simple QA. It consists of 108,442 questions written in natural language by English-speaking human annotators, each paired with a corresponding fact, formatted as (subject, relationship, object), that provides the answer as well as a complete explanation. The facts were extracted from the Freebase knowledge base (freebase.com). The questions are randomly shuffled, with 70% (75,910) used as the training set, 10% (10,845) as the validation set, and the remaining 20% as the test set.
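The shuffle-and-split procedure described above can be sketched as follows. This is an illustration under assumptions, not the dataset authors' code: the function name `shuffle_and_split`, the fixed seed, and the use of ceiling division are all hypothetical choices. Ceiling division is used here only because it happens to reproduce the split sizes quoted in the description; the source does not state how the authors rounded.

```python
import random


def shuffle_and_split(items, seed=0):
    """Shuffle items, then split them roughly 70% / 10% / 20%."""
    shuffled = list(items)
    # A seeded generator keeps the illustrative split reproducible.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Assumption: round the 70% and 10% shares up (ceiling division);
    # the test set takes whatever remains.
    n_train = -(-n * 7 // 10)
    n_valid = -(-n // 10)

    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Stand-in integer IDs in place of the real questions.
train, valid, test = shuffle_and_split(range(108_442))
```

With 108,442 stand-in items, the three parts have 75,910, 10,845, and 21,687 elements, which matches the sizes reported for SimpleQuestions.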
output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128...
huggingface.co
Updated Feb 19, 2025
Cite
Sen Yang (2025). output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128 [Dataset]. https://huggingface.co/datasets/ringos/output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128
Explore at:
Croissant
Dataset updated
Feb 19, 2025
Authors
Sen Yang
Description
ringos/output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128 dataset hosted on Hugging Face and contributed by the HF Datasets community
SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct
huggingface.co
Cite
Rulin Shao, SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct [Dataset]. https://huggingface.co/datasets/rulins/SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct
Authors
Rulin Shao
Description
Synthetic oracle datastore for SimpleQA. The oracle document is generated based on the problem and answer. This data is generated by Llama3.3-70B-Instruct.

template = f""" You are a helpful assistant that can synthesize a Wikipedia document from a question and an answer. The document should be an actual Wikipedia article that can be helpful for answering the question. Do not directly include the question in the document. The document should contain around 150 words.

Question: {{question}} … See the full description on the dataset page: https://huggingface.co/datasets/rulins/SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct.
ACG-SimpleQA
huggingface.co
Updated Apr 24, 2025
Cite
Papersnake (2025). ACG-SimpleQA [Dataset]. https://huggingface.co/datasets/Papersnake/ACG-SimpleQA
Dataset updated
Apr 24, 2025
Authors
Papersnake
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
ACG-SimpleQA

ACG-SimpleQA is an objective knowledge question-answering dataset focused on the Chinese ACG (Animation, Comic, Game) domain, containing 4,242 automatically generated, carefully designed QA samples. The benchmark aims to evaluate the factual capabilities of large language models in the ACG culture domain. Its features are Chinese-language content, diversity, high quality, static answers, and easy evaluation.

📢 Latest Updates… See the full description on the dataset page: https://huggingface.co/datasets/Papersnake/ACG-SimpleQA.
openai_simple_qa_test_set
huggingface.co
Updated Oct 30, 2024
Cite
MAISA AI (2024). openai_simple_qa_test_set [Dataset]. https://huggingface.co/datasets/MAISAAI/openai_simple_qa_test_set
Explore at:
Croissant
Dataset updated
Oct 30, 2024
Dataset provided by
Maisa Inc.
Authors
MAISA AI
Description
Model Card: SimpleQA Benchmark

Information from the OpenAI blog post.
Model Card for SimpleQA
Version: v1.0
Date: October 30, 2024
Authors: Jason Wei, Karina Nguyen, Hyung Won Chung, Joy Jiao, Spencer Papay, Mia Glaese, John Schulman, Liam Fedus
Acknowledgements: Adam Tauman Kalai

Model Overview

SimpleQA is a factuality benchmark designed to evaluate the accuracy and reliability of language models in responding to short, fact-seeking questions. Aimed at assessing models'… See the full description on the dataset page: https://huggingface.co/datasets/MAISAAI/openai_simple_qa_test_set.
synthetic-rag-simple-qa-4th-to-6th
huggingface.co
Updated May 30, 2024
+ more versions
Cite
Lambent (2024). synthetic-rag-simple-qa-4th-to-6th [Dataset]. https://huggingface.co/datasets/Lambent/synthetic-rag-simple-qa-4th-to-6th
Explore at:
Croissant
Dataset updated
May 30, 2024
Authors
Lambent
Description
Lambent/synthetic-rag-simple-qa-4th-to-6th dataset hosted on Hugging Face and contributed by the HF Datasets community
simple-qa
huggingface.co
Updated Apr 20, 2023
Cite
Piotr Rybak (2023). simple-qa [Dataset]. https://huggingface.co/datasets/piotr-rybak/simple-qa
Explore at:
Croissant
Dataset updated
Apr 20, 2023
Authors
Piotr Rybak
Description
piotr-rybak/simple-qa dataset hosted on Hugging Face and contributed by the HF Datasets community
Wikipedia-Turkish-SimpleQA
huggingface.co
Updated May 28, 2025
Cite
Murat Tut (2025). Wikipedia-Turkish-SimpleQA [Dataset]. https://huggingface.co/datasets/kesitt/Wikipedia-Turkish-SimpleQA
Dataset updated
May 28, 2025
Authors
Murat Tut
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Area covered
Türkiye
Description
kesitt/Wikipedia-Turkish-SimpleQA dataset hosted on Hugging Face and contributed by the HF Datasets community
synthetic-rag-hermes-simple-qa-1st-ic
huggingface.co
Updated Jun 16, 2024
+ more versions
Cite
Lambent (2024). synthetic-rag-hermes-simple-qa-1st-ic [Dataset]. https://huggingface.co/datasets/Lambent/synthetic-rag-hermes-simple-qa-1st-ic
Explore at:
Croissant
Dataset updated
Jun 16, 2024
Authors
Lambent
Description
Lambent/synthetic-rag-hermes-simple-qa-1st-ic dataset hosted on Hugging Face and contributed by the HF Datasets community

basicv8vc/SimpleQA

150 scholarly articles cite this dataset (View in Google Scholar)
