29 datasets found
  1. Data from: sciq

    • huggingface.co
    • paperswithcode.com
    • +1 more
    Updated Mar 1, 2024
    Cite
    Ai2 (2024). sciq [Dataset]. https://huggingface.co/datasets/allenai/sciq
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Dataset Card for "sciq"

      Dataset Summary
    

    The SciQ dataset contains 13,679 crowdsourced science exam questions about Physics, Chemistry and Biology, among others. The questions are in multiple-choice format with 4 answer options each. For the majority of the questions, an additional paragraph with supporting evidence for the correct answer is provided.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    More Information Needed… See the full description on the dataset page: https://huggingface.co/datasets/allenai/sciq.

  2. Data from: SciQ

    • huggingface.co
    Updated Jun 8, 2023
    Cite
    Dhruv C (2023). SciQ [Dataset]. https://huggingface.co/datasets/dhruvjwc/SciQ
    Explore at:
    Croissant
    Dataset updated
    Jun 8, 2023
    Authors
    Dhruv C
    Description

    dhruvjwc/SciQ dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Data from: sciq

    • huggingface.co
    Updated May 29, 2025
    Cite
    llm-uncertainty-head (2025). sciq [Dataset]. https://huggingface.co/datasets/llm-uncertainty-head/sciq
    Dataset updated
    May 29, 2025
    Dataset authored and provided by
    llm-uncertainty-head
    Description

    llm-uncertainty-head/sciq dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. eval-sciq

    • huggingface.co
    Cite
    Yonathan, eval-sciq [Dataset]. https://huggingface.co/datasets/c0ntrolZ/eval-sciq
    Authors
    Yonathan
    Description

    c0ntrolZ/eval-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. SCIQ for NLP

    • kaggle.com
    Updated Mar 13, 2024
    Cite
    Ziad Ayman (2024). SCIQ for NLP [Dataset]. https://www.kaggle.com/datasets/ziadaymantesla/filtered/suggestions
    Explore at:
    Croissant
    Dataset updated
    Mar 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ziad Ayman
    Description

    Dataset

    This dataset was created by Ziad Ayman

    Contents

  6. Data from: sciq

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Youssef Boughizane (2025). sciq [Dataset]. https://huggingface.co/datasets/Youssefbou62/sciq
    Dataset updated
    Jun 1, 2025
    Authors
    Youssef Boughizane
    Description

    Youssefbou62/sciq dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Scientific Knowledge Evaluation Dataset

    • opendatabay.com
    Updated Jul 4, 2025
    Cite
    Datasimple (2025). Scientific Knowledge Evaluation Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/606f5704-f0a7-4949-810d-443d020dd438
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Education & Learning Analytics
    Description

    This dataset contains a collection of 13,679 crowdsourced science exam questions, primarily focusing on Physics, Chemistry, and Biology. The questions are presented in a multiple-choice format, each with four answer options. For the majority of the questions, an additional paragraph providing supporting evidence for the correct answer is also included. The dataset is designed to evaluate a person's knowledge of science and can be used for various research and application purposes.

    Columns

    • question: The main text of the scientific question. (String)
    • distractor3: One of the incorrect answer options designed to distract the test taker. (String)
    • distractor1: Another incorrect answer option. (String)
    • distractor2: A third incorrect answer option. (String)
    • correct_answer: The accurate answer to the question. (String)
    • support: Supplementary text that provides evidence or further context for the correct answer, helping users understand the question. (String)
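
    The column layout above maps directly onto a four-option multiple-choice item. The following is a minimal sketch of that mapping, not the dataset authors' own tooling: the function name `build_choices`, the `seed` parameter, and the sample record are all hypothetical and only illustrate the column names listed above.

```python
import random


def build_choices(record, seed=0):
    """Return (options, index_of_correct_answer) for one record.

    `record` is assumed to be a dict keyed by the column names
    listed above. The shuffle is seeded so the order is reproducible.
    """
    options = [
        record["correct_answer"],
        record["distractor1"],
        record["distractor2"],
        record["distractor3"],
    ]
    random.Random(seed).shuffle(options)
    return options, options.index(record["correct_answer"])


# Invented record for illustration only; it is not taken from the dataset.
record = {
    "question": "Which gas do plants absorb during photosynthesis?",
    "distractor3": "helium",
    "distractor1": "oxygen",
    "distractor2": "nitrogen",
    "correct_answer": "carbon dioxide",
    "support": "Plants take in carbon dioxide and release oxygen.",
}

options, answer_index = build_choices(record)
```

    Shuffling matters here because the correct answer sits in its own column; presenting the columns in a fixed order would make the answer position trivially predictable.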

    Distribution

    The dataset is primarily available as a CSV file, specifically test.csv, which is used for evaluation. It comprises 13,679 records or individual science exam questions. The exact file size is not detailed in the provided information, but its structure is consistent with a tabular format where each row represents a question and its associated data.
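
    Reading such a tabular file can be sketched with Python's standard `csv` module. This is an illustration under assumptions: the header row is assumed to match the column names listed under Columns, and `SAMPLE_CSV` is an invented two-row stand-in for test.csv, not real data. The helper name `read_questions` is hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "question", "distractor3", "distractor1",
    "distractor2", "correct_answer", "support",
]

# Invented stand-in for test.csv (assumption: the real file has this header).
SAMPLE_CSV = (
    "question,distractor3,distractor1,distractor2,correct_answer,support\n"
    '"Which gas do plants absorb during photosynthesis?",helium,oxygen,'
    'nitrogen,carbon dioxide,"Plants take in carbon dioxide, then release oxygen."\n'
    '"What force pulls objects toward Earth?",friction,magnetism,tension,gravity,\n'
)


def read_questions(handle):
    """Parse rows into dicts and fail early if a column is missing."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


rows = read_questions(io.StringIO(SAMPLE_CSV))

# Only some questions carry a supporting paragraph, so filter on it.
with_support = [row for row in rows if row["support"].strip()]
```

    For a file on disk, the same helper can be given a handle opened with `open("test.csv", newline="", encoding="utf-8")`; the `newline=""` argument is what the `csv` documentation recommends so that quoted fields containing line breaks are parsed correctly.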

    Usage

    This dataset is ideally suited for evaluating scientific knowledge and for research in natural language processing (NLP). It can be particularly useful for:
    • Developing and training models to answer scientific questions.
    • Creating AI-powered educational tools for science learning.
    • Assessing human or AI performance on science examinations.
    • Generating insights into common distractors and improving question design.
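
    Assessing performance on questions like these usually comes down to comparing each chosen answer with the `correct_answer` column. The sketch below shows one simple way to score that, assuming predictions arrive as answer strings; the function name `accuracy`, the case-insensitive matching rule, and the sample data are assumptions made for illustration, not part of the dataset.

```python
def accuracy(records, predictions):
    """Fraction of predictions that match `correct_answer`.

    Matching ignores case and surrounding whitespace, which is an
    assumption of this sketch rather than an official scoring rule.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    if not records:
        return 0.0
    correct = sum(
        1
        for record, predicted in zip(records, predictions)
        if predicted.strip().lower() == record["correct_answer"].strip().lower()
    )
    return correct / len(records)


# Invented examples; only the `correct_answer` key is needed for scoring.
records = [
    {"correct_answer": "gravity"},
    {"correct_answer": "carbon dioxide"},
]
predictions = ["Gravity", "oxygen"]

score = accuracy(records, predictions)
```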

    Coverage

    The dataset offers global relevance as the scientific questions are not tied to a specific geographical region. It covers core science subjects including Physics, Chemistry, and Biology. No specific time range is indicated for the origin of the questions, suggesting they are general science concepts. There are no particular notes on data availability for specific demographic groups, as the focus is on subject matter knowledge.

    License

    CC0

    Who Can Use It

    The dataset is intended for a variety of users, including:
    • Researchers in AI, machine learning, and natural language processing to develop and test question-answering systems.
    • Educators and educational technology developers to create assessment tools or learning platforms.
    • Data scientists and analysts interested in text data analysis and knowledge representation.
    • Students undertaking projects related to scientific reasoning and AI.

    Dataset Name Suggestions

    • Scientific Knowledge Evaluation Dataset
    • Science Exam Questions Collection
    • Multi-Choice Science Questions
    • SciQ Science Questions and Answers
    • AI Science Question-Answering Corpus

    Attributes

    Original Data Source: SciQ (Scientific Question Answering)

  8. sciq-qa-dataset_llama_template

    • huggingface.co
    Updated May 2, 2024
    Cite
    Jeena A Thankachan (2024). sciq-qa-dataset_llama_template [Dataset]. https://huggingface.co/datasets/JeenaAT/sciq-qa-dataset_llama_template
    Explore at:
    Croissant
    Dataset updated
    May 2, 2024
    Authors
    Jeena A Thankachan
    Description

    JeenaAT/sciq-qa-dataset_llama_template dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. 10K rewritten texts dataset/LLM Prompt Recovery

    • kaggle.com
    zip
    Updated Apr 8, 2024
    Cite
    Aisha AL Mahmoud (2024). 10K rewritten texts dataset/LLM Prompt Recovery [Dataset]. https://www.kaggle.com/datasets/aishaalmahmoud/10k-rewritten-texts-datasetllm-prompt-recovery
    Available download formats: zip (0 bytes)
    Dataset updated
    Apr 8, 2024
    Authors
    Aisha AL Mahmoud
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    About 10,000 texts rewritten using Gemma 7b-it. The original texts come from the "Support" column of train.csv in the SciQ (Scientific Question Answering) dataset.

    If you find it useful, upvote it.

  10. sandbagging-sciq

    • huggingface.co
    Updated Mar 7, 2024
    Cite
    Rob G (2024). sandbagging-sciq [Dataset]. https://huggingface.co/datasets/themachinefan/sandbagging-sciq
    Dataset updated
    Mar 7, 2024
    Authors
    Rob G
    Description

    themachinefan/sandbagging-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. sandbagging-sciq

    • huggingface.co
    Cite
    Joseph Bloom, sandbagging-sciq [Dataset]. https://huggingface.co/datasets/jbloom-aisi/sandbagging-sciq
    Explore at:
    Croissant
    Authors
    Joseph Bloom
    Description

    jbloom-aisi/sandbagging-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. gemma-2-2b-it-sciq

    • huggingface.co
    Cite
    Joseph Bloom, gemma-2-2b-it-sciq [Dataset]. https://huggingface.co/datasets/jbloom-aisi/gemma-2-2b-it-sciq
    Explore at:
    Croissant
    Authors
    Joseph Bloom
    Description

    jbloom-aisi/gemma-2-2b-it-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. sciq-with-generated-questions

    • huggingface.co
    Updated Feb 24, 2013
    Cite
    NLP Group 6 (2013). sciq-with-generated-questions [Dataset]. https://huggingface.co/datasets/nlp-group-6/sciq-with-generated-questions
    Explore at:
    Croissant
    Dataset updated
    Feb 24, 2013
    Dataset authored and provided by
    NLP Group 6
    Description

    nlp-group-6/sciq-with-generated-questions dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. sciq-qa1

    • huggingface.co
    Updated May 2, 2024
    Cite
    B (2024). sciq-qa1 [Dataset]. https://huggingface.co/datasets/Nandini82/sciq-qa1
    Explore at:
    Croissant
    Dataset updated
    May 2, 2024
    Authors
    B
    Description

    Nandini82/sciq-qa1 dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. sciq-text-only

    • huggingface.co
    Updated Jun 8, 2025
    Cite
    Paul (2025). sciq-text-only [Dataset]. https://huggingface.co/datasets/pmdlt/sciq-text-only
    Dataset updated
    Jun 8, 2025
    Authors
    Paul
    Description

    pmdlt/sciq-text-only dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. sciq-mcqa

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Viola (2025). sciq-mcqa [Dataset]. https://huggingface.co/datasets/viols/sciq-mcqa
    Dataset updated
    Jun 1, 2025
    Authors
    Viola
    Description

    viols/sciq-mcqa dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. sciq-dpo-stem

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Reza (2025). sciq-dpo-stem [Dataset]. https://huggingface.co/datasets/reza-rgb/sciq-dpo-stem
    Dataset updated
    Jun 1, 2025
    Authors
    Reza
    Description

    reza-rgb/sciq-dpo-stem dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. sandbagging-sciq-emulate-gemma-2-2b-it

    • huggingface.co
    Cite
    Joseph Bloom, sandbagging-sciq-emulate-gemma-2-2b-it [Dataset]. https://huggingface.co/datasets/jbloom-aisi/sandbagging-sciq-emulate-gemma-2-2b-it
    Explore at:
    Croissant
    Authors
    Joseph Bloom
    Description

    jbloom-aisi/sandbagging-sciq-emulate-gemma-2-2b-it dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. quirky_sciq_raw

    • huggingface.co
    Updated Apr 20, 2024
    Cite
    Erik Jenner (2024). quirky_sciq_raw [Dataset]. https://huggingface.co/datasets/ejenner/quirky_sciq_raw
    Explore at:
    Croissant
    Dataset updated
    Apr 20, 2024
    Authors
    Erik Jenner
    Description

    ejenner/quirky_sciq_raw dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. sciq_italian

    • huggingface.co
    Updated Dec 4, 2024
    Cite
    Sapienza NLP, Sapienza University of Rome (2024). sciq_italian [Dataset]. https://huggingface.co/datasets/sapienzanlp/sciq_italian
    Dataset updated
    Dec 4, 2024
    Dataset authored and provided by
    Sapienza NLP, Sapienza University of Rome
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    SciQ - Italian (IT)

    This dataset is an Italian translation of SciQ. SciQ is a dataset for scientific questions, which were semi-automatically generated from an existing set of questions. The dataset is designed to test the ability of models to answer questions that require scientific knowledge.

      Dataset Details
    

    The dataset consists of science-related questions, where each question is associated with a correct answer and three possible distractors. The task is to predict… See the full description on the dataset page: https://huggingface.co/datasets/sapienzanlp/sciq_italian.
