Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Dataset Card for "sciq"
Dataset Summary
The SciQ dataset contains 13,679 crowdsourced science exam questions about Physics, Chemistry and Biology, among others. The questions are in multiple-choice format with 4 answer options each. For the majority of the questions, an additional paragraph with supporting evidence for the correct answer is provided.
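As a rough illustration of the format described above, the following sketch turns one record into a four-option multiple-choice prompt. The field names (question, correct_answer, distractor1 to distractor3, support) are assumed rather than taken from this listing, and the example record is invented for illustration only.

```python
import random

LETTERS = "ABCD"

def to_multiple_choice(record, seed=0):
    """Build a shuffled four-option prompt and return it with the answer letter.

    Assumes hypothetical field names: question, correct_answer,
    distractor1, distractor2, distractor3.
    """
    options = [
        record["correct_answer"],
        record["distractor1"],
        record["distractor2"],
        record["distractor3"],
    ]
    # A seeded generator keeps the option order reproducible.
    random.Random(seed).shuffle(options)
    answer = LETTERS[options.index(record["correct_answer"])]
    lines = [record["question"]]
    lines += [f"{letter}. {option}" for letter, option in zip(LETTERS, options)]
    return "\n".join(lines), answer

# Invented record, not drawn from the dataset.
example = {
    "question": "What gas do plants absorb during photosynthesis?",
    "correct_answer": "carbon dioxide",
    "distractor1": "oxygen",
    "distractor2": "nitrogen",
    "distractor3": "helium",
    "support": "Plants take in carbon dioxide and release oxygen.",
}

prompt, answer = to_multiple_choice(example)
print(prompt)
print("Answer:", answer)
```

The supporting paragraph is left out of the prompt here; since it is only provided for the majority of questions, code that uses it should handle an empty value.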
Supported Tasks and Leaderboards
More Information Needed
Languages
More Information Needed… See the full description on the dataset page: https://huggingface.co/datasets/allenai/sciq.
dhruvjwc/SciQ dataset hosted on Hugging Face and contributed by the HF Datasets community
llm-uncertainty-head/sciq dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by Ziad Ayman
Youssefbou62/sciq dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains a collection of 13,679 crowdsourced science exam questions, primarily focusing on Physics, Chemistry, and Biology. The questions are presented in a multiple-choice format, each with four answer options. For the majority of the questions, an additional paragraph providing supporting evidence for the correct answer is also included. The dataset is designed to evaluate a person's knowledge of science and can be used for various research and application purposes.
The dataset is primarily available as a CSV file, specifically test.csv, which is used for evaluation. It comprises 13,679 records or individual science exam questions. The exact file size is not detailed in the provided information, but its structure is consistent with a tabular format where each row represents a question and its associated data.
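A minimal sketch of reading such a file with Python's standard csv module is shown below. The column names and the two sample rows are hypothetical, since the listing does not give the actual header; an in-memory string stands in for test.csv so the example runs on its own.

```python
import csv
import io

# Hypothetical header and invented rows standing in for test.csv.
SAMPLE = """question,correct_answer,distractor1,distractor2,distractor3,support
What gas do plants absorb?,carbon dioxide,oxygen,nitrogen,helium,Plants take in carbon dioxide during photosynthesis.
What is the SI unit of force?,newton,joule,watt,pascal,
"""

def load_questions(handle):
    """Read every row into a dict keyed by the header columns."""
    return list(csv.DictReader(handle))

rows = load_questions(io.StringIO(SAMPLE))

# Some rows may have no supporting paragraph, so filter on a non-empty value.
with_support = [row for row in rows if row["support"].strip()]

print(len(rows), "questions,", len(with_support), "with supporting text")
```

To read a real file, the same function could be called with `open("test.csv", newline="", encoding="utf-8")` in place of the in-memory sample, assuming the file carries a header row.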
This dataset is ideally suited for evaluating scientific knowledge and for research in natural language processing (NLP). It can be particularly useful for:
* Developing and training models to answer scientific questions.
* Creating AI-powered educational tools for science learning.
* Assessing human or AI performance on science examinations.
* Generating insights into common distractors and improving question design.
The dataset offers global relevance as the scientific questions are not tied to a specific geographical region. It covers core science subjects including Physics, Chemistry, and Biology. No specific time range is indicated for the origin of the questions, suggesting they are general science concepts. There are no particular notes on data availability for specific demographic groups, as the focus is on subject matter knowledge.
CC0
The dataset is intended for a variety of users, including:
* Researchers in AI, machine learning, and natural language processing to develop and test question-answering systems.
* Educators and educational technology developers to create assessment tools or learning platforms.
* Data scientists and analysts interested in text data analysis and knowledge representation.
* Students undertaking projects related to scientific reasoning and AI.
Original Data Source: SciQ (Scientific Question Answering)
JeenaAT/sciq-qa-dataset_llama_template dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
About 10,000 texts rewritten with Gemma 7b-it. The original texts come from the "Support" column of train.csv in the SciQ (Scientific Question Answering) dataset.
If you find it useful, upvote it.
themachinefan/sandbagging-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community
jbloom-aisi/sandbagging-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community
jbloom-aisi/gemma-2-2b-it-sciq dataset hosted on Hugging Face and contributed by the HF Datasets community
nlp-group-6/sciq-with-generated-questions dataset hosted on Hugging Face and contributed by the HF Datasets community
Nandini82/sciq-qa1 dataset hosted on Hugging Face and contributed by the HF Datasets community
pmdlt/sciq-text-only dataset hosted on Hugging Face and contributed by the HF Datasets community
reza-rgb/sciq-dpo-stem dataset hosted on Hugging Face and contributed by the HF Datasets community
jbloom-aisi/sandbagging-sciq-emulate-gemma-2-2b-it dataset hosted on Hugging Face and contributed by the HF Datasets community
ejenner/quirky_sciq_raw dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
SciQ - Italian (IT)
This dataset is an Italian translation of SciQ. SciQ is a dataset for scientific questions, which were semi-automatically generated from an existing set of questions. The dataset is designed to test the ability of models to answer questions that require scientific knowledge.
Dataset Details
The dataset consists of science-related questions, where each question is associated with a correct answer and three possible distractors. The task is to predict… See the full description on the dataset page: https://huggingface.co/datasets/sapienzanlp/sciq_italian.