25 datasets found
  1. SimpleQA

    • huggingface.co
    Cite
    basicv8vc, SimpleQA [Dataset]. https://huggingface.co/datasets/basicv8vc/SimpleQA
    Authors
    basicv8vc
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    SimpleQA

    SimpleQA is a factuality benchmark that measures the ability of language models to answer short, fact-seeking questions.

    Sources: openai/simple-evals; Introducing SimpleQA; Measuring short-form factuality in large language models

  2. Share of questions answered by AI models in SimpleQA benchmark 2025

    • statista.com
    Updated May 30, 2025
    Cite
    Statista (2025). Share of questions answered by AI models in SimpleQA benchmark 2025 [Dataset]. https://www.statista.com/statistics/1612496/ai-simpleqa-share-of-questions-answered/
    Dataset updated
    May 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2024
    Area covered
    Worldwide
    Description

    OpenAI's o1 had the highest share of questions answered when attempted in the SimpleQA benchmark in 2025. Claude-3 had the highest share of questions not attempted, though it is unknown whether this is due to a lack of data or other reasons.

  3. simpleQA

    • huggingface.co
    Cite
    Oid Labs, simpleQA [Dataset]. https://huggingface.co/datasets/oidlabs/simpleQA
    Dataset authored and provided by
    Oid Labs
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    oidlabs/simpleQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. simpleqa

    • huggingface.co
    Cite
    China Merchants Research Institute Of Advanced Technology, simpleqa [Dataset]. https://huggingface.co/datasets/cmriat/simpleqa
    Dataset authored and provided by
    China Merchants Research Institute Of Advanced Technology
    Description

    cmriat/simpleqa dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. simpleqa

    • huggingface.co
    Cite
    Llama Stack, simpleqa [Dataset]. https://huggingface.co/datasets/llamastack/simpleqa
    Dataset authored and provided by
    Llama Stack
    Description

    llamastack/simpleqa dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. SimpleQuestions Dataset

    • paperswithcode.com
    Updated Aug 14, 2021
    Cite
    Antoine Bordes; Nicolas Usunier; Sumit Chopra; Jason Weston (2021). SimpleQuestions Dataset [Dataset]. https://paperswithcode.com/dataset/simplequestions
    Dataset updated
    Aug 14, 2021
    Authors
    Antoine Bordes; Nicolas Usunier; Sumit Chopra; Jason Weston
    Description

    SimpleQuestions is a large-scale factoid question answering dataset. It consists of 108,442 natural language questions, each paired with a corresponding fact from the Freebase knowledge base. Each fact is a triple (subject, relation, object), and the answer to the question is always the object. The dataset is divided into training, validation, and test sets with 75,910, 10,845, and 21,687 questions respectively.
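
    The triple structure and split sizes described above can be sketched as follows. This is only an illustration under stated assumptions: the `Fact` and `Example` types, the `answer` helper, and the sample record are hypothetical and are not taken from the dataset or any loader for it; only the split sizes and the total come from the description.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A (subject, relation, object) triple; hypothetical type for illustration."""
    subject: str
    relation: str
    obj: str


class Example(NamedTuple):
    """A question paired with the fact that answers it (hypothetical type)."""
    question: str
    fact: Fact


def answer(example: Example) -> str:
    # Per the description, the answer is always the object of the triple.
    return example.fact.obj


# Split sizes and total as stated in the dataset description.
SPLITS = {"train": 75_910, "validation": 10_845, "test": 21_687}
TOTAL = 108_442

# Made-up record, not an actual entry from SimpleQuestions.
sample = Example(
    question="Who wrote Hamlet?",
    fact=Fact("Hamlet", "written_by", "William Shakespeare"),
)

# Fraction of the full dataset in each split, rounded to three decimals.
split_fractions = {name: round(size / TOTAL, 3) for name, size in SPLITS.items()}
```

    The three split sizes add up to the stated total of 108,442, and the rounded fractions correspond to a 70/10/20 split.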

  7. SimpleQA Large Model Evaluation Benchmark Leaderboard

    • datalearner.com
    Updated Oct 15, 2024
    Cite
    数据学习 (DataLearner) (2024). SimpleQA Large Model Evaluation Benchmark Leaderboard [Dataset]. https://www.datalearner.com/ai-models/llm-benchmark-tests/33
    Dataset updated
    Oct 15, 2024
    Dataset authored and provided by
    数据学习 (DataLearner)
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

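    The latest large language model (LLM) performance leaderboard based on the SimpleQA benchmark, including each model's score, publishing organization, release date, and other data.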

  8. SimpleQA-RLVR-noprompt

    • huggingface.co
    + more versions
    Cite
    Hamish Ivison, SimpleQA-RLVR-noprompt [Dataset]. https://huggingface.co/datasets/hamishivi/SimpleQA-RLVR-noprompt
    Authors
    Hamish Ivison
    Description

    hamishivi/SimpleQA-RLVR-noprompt dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. SimpleQA-1000

    • huggingface.co
    Updated Mar 23, 2025
    Cite
    Andrey Galichin (2025). SimpleQA-1000 [Dataset]. https://huggingface.co/datasets/andreuka18/SimpleQA-1000
    Dataset updated
    Mar 23, 2025
    Authors
    Andrey Galichin
    Description

    andreuka18/SimpleQA-1000 dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128...

    • huggingface.co
    Updated Dec 25, 2024
    + more versions
    Cite
    Sen Yang (2024). output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128 [Dataset]. https://huggingface.co/datasets/ringos/output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128
    Dataset updated
    Dec 25, 2024
    Authors
    Sen Yang
    Description

    ringos/output_Llama-3.1-8B-simpleqa-0_1000-m_generation-n_128-t_1.0-k_50-p_0.95-l_128 dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions

    • huggingface.co
    Cite
    Noah Shen, simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions [Dataset]. https://huggingface.co/datasets/NoahShen/simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions
    Authors
    Noah Shen
    Description

    NoahShen/simpleqa-llama3.1-8b-inst-safe-rlhf-0628-completions dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. simple_questions_v2

    • huggingface.co
    Updated May 24, 2024
    + more versions
    Cite
    Fethi BOUGARES (2024). simple_questions_v2 [Dataset]. https://huggingface.co/datasets/fbougares/simple_questions_v2
    Dataset updated
    May 24, 2024
    Authors
    Fethi BOUGARES
    License

    Attribution 3.0 (CC BY 3.0), https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    SimpleQuestions is a dataset for simple QA. It consists of a total of 108,442 questions written in natural language by English-speaking human annotators, each paired with a corresponding fact, formatted as (subject, relationship, object), that provides the answer as well as a complete explanation. Facts have been extracted from the Freebase knowledge base (freebase.com). We randomly shuffle these questions and use 70% of them (75,910) as the training set, 10% (10,845) as the validation set, and the remaining 20% as the test set.

  13. output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128...

    • huggingface.co
    Updated Feb 19, 2025
    Cite
    Sen Yang (2025). output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128 [Dataset]. https://huggingface.co/datasets/ringos/output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128
    Dataset updated
    Feb 19, 2025
    Authors
    Sen Yang
    Description

    ringos/output_Mistral-Nemo-Base-2407-simpleqa-0_1000-m_generation-n_32-t_1.0-k_40-p_0.9-l_128 dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct

    • huggingface.co
    Cite
    Rulin Shao, SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct [Dataset]. https://huggingface.co/datasets/rulins/SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct
    Authors
    Rulin Shao
    Description

    Synthetic oracle datastore for SimpleQA. The oracle document is generated based on the problem and answer. This data is generated by Llama3.3-70B-Instruct. template = f""" You are a helpful assistant that can synthesize a Wikipedia document from a question and an answer. The document should be an actual Wikipedia article that can be helpful for answering the question. Do not directly include the question in the document. The document should contain around 150 words.

    Question: {{question}} … See the full description on the dataset page: https://huggingface.co/datasets/rulins/SimpleQA-synthetic-datastore-Llama3.3-70B-Instruct.

  15. ACG-SimpleQA

    • huggingface.co
    Updated Apr 24, 2025
    Cite
    Papersnake (2025). ACG-SimpleQA [Dataset]. https://huggingface.co/datasets/Papersnake/ACG-SimpleQA
    Dataset updated
    Apr 24, 2025
    Authors
    Papersnake
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description


    ACG-SimpleQA is an objective knowledge question-answering dataset focused on the Chinese ACG (Animation, Comic, Game) domain, containing 4,242 auto-generated, carefully designed QA samples. This benchmark aims to evaluate large language models' factual capabilities in the ACG culture domain, featuring Chinese language, diversity, high quality, static answers, and easy evaluation.

      📢 Latest Updates… See the full description on the dataset page: https://huggingface.co/datasets/Papersnake/ACG-SimpleQA.
    
  16. openai_simple_qa_test_set

    • huggingface.co
    Updated Oct 30, 2024
    Cite
    MAISA AI (2024). openai_simple_qa_test_set [Dataset]. https://huggingface.co/datasets/MAISAAI/openai_simple_qa_test_set
    Dataset updated
    Oct 30, 2024
    Dataset provided by
    Maisa Inc.
    Authors
    MAISA AI
    Description

    Model Card: SimpleQA Benchmark

    Information from OpenAI blogpost
    Model Card for SimpleQA
    Version: v1.0
    Date: October 30, 2024
    Authors: Jason Wei, Karina Nguyen, Hyung Won Chung, Joy Jiao, Spencer Papay, Mia Glaese, John Schulman, Liam Fedus
    Acknowledgements: Adam Tauman Kalai

      Model Overview
    

    SimpleQA is a factuality benchmark designed to evaluate the accuracy and reliability of language models in responding to short, fact-seeking questions. Aimed at assessing models'… See the full description on the dataset page: https://huggingface.co/datasets/MAISAAI/openai_simple_qa_test_set.

  17. synthetic-rag-simple-qa-4th-to-6th

    • huggingface.co
    Updated May 30, 2024
    + more versions
    Cite
    Lambent (2024). synthetic-rag-simple-qa-4th-to-6th [Dataset]. https://huggingface.co/datasets/Lambent/synthetic-rag-simple-qa-4th-to-6th
    Dataset updated
    May 30, 2024
    Authors
    Lambent
    Description

    Lambent/synthetic-rag-simple-qa-4th-to-6th dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. simple-qa

    • huggingface.co
    Updated Apr 20, 2023
    Cite
    Piotr Rybak (2023). simple-qa [Dataset]. https://huggingface.co/datasets/piotr-rybak/simple-qa
    Dataset updated
    Apr 20, 2023
    Authors
    Piotr Rybak
    Description

    piotr-rybak/simple-qa dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. Wikipedia-Turkish-SimpleQA

    • huggingface.co
    Updated May 28, 2025
    Cite
    Murat Tut (2025). Wikipedia-Turkish-SimpleQA [Dataset]. https://huggingface.co/datasets/kesitt/Wikipedia-Turkish-SimpleQA
    Dataset updated
    May 28, 2025
    Authors
    Murat Tut
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    Türkiye
    Description

    kesitt/Wikipedia-Turkish-SimpleQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. synthetic-rag-hermes-simple-qa-1st-ic

    • huggingface.co
    Updated Jun 16, 2024
    + more versions
    Cite
    Lambent (2024). synthetic-rag-hermes-simple-qa-1st-ic [Dataset]. https://huggingface.co/datasets/Lambent/synthetic-rag-hermes-simple-qa-1st-ic
    Dataset updated
    Jun 16, 2024
    Authors
    Lambent
    Description

    Lambent/synthetic-rag-hermes-simple-qa-1st-ic dataset hosted on Hugging Face and contributed by the HF Datasets community

Cite
basicv8vc, SimpleQA [Dataset]. https://huggingface.co/datasets/basicv8vc/SimpleQA

150 scholarly articles cite this dataset (View in Google Scholar)