MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "commonsense_qa"
Dataset Summary
CommonsenseQA is a multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions, each with one correct answer and four distractor answers. The dataset is provided in two major training/validation/testing splits: the "Random split", which is the main evaluation split, and the "Question token split"; see the paper for details.… See the full description on the dataset page: https://huggingface.co/datasets/tau/commonsense_qa.
This dataset was created by Darien Schettler.
MIT License: https://opensource.org/licenses/MIT
Commonsense QA CoT (Partial, Raw, No Human Annotation)
Dataset Summary
Seeded by the CommonsenseQA dataset (tau/commonsense_qa), this preliminary set randomly samples 1,000 question-answer entries and uses Mixtral (mistralai/Mixtral-8x7B-Instruct-v0.1) to generate 3 unique Chain-of-Thought (CoT) rationales per entry. It was created as a preliminary step toward fine-tuning a language model (LM) to specialize in commonsense reasoning. The working hypothesis, inspired by the… See the full description on the dataset page: https://huggingface.co/datasets/peterkchung/commonsense_cot_partial_raw.
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
Huggingface Hub: link
The Cosmos QA dataset is a large-scale dataset of 35.6K problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. The dataset focuses on reading between the lines over a diverse collection of people's everyday narratives, asking questions concerning the likely causes or effects of events that require reasoning beyond the exact text spans in the context.
This allows much more sophisticated models to be built and evaluated, and could lead to better performance on real-world tasks.
To use the Cosmos QA dataset, first download the data files from the Kaggle website. Once you have downloaded the files, unzip them and place them in a directory on your computer.
With the data files in place, you can begin using the dataset for commonsense-based reading comprehension tasks. Start by opening the context file in a text editor and locating the section of text that contains the question you want to answer.
Once you have located that section, read through the context to determine what type of answer would be most appropriate. After carefully reading the context, look at each of the answer choices and select the one that best fits what you have read.
- This dataset can be used to develop and evaluate commonsense-based reading comprehension models.
- This dataset can be used to improve and customize question answering systems for educational or customer service applications.
- This dataset can be used to study how human beings process and understand narratives, in order to better design artificial intelligence systems that can do the same.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: validation.csv

| Column name | Description |
|:------------|:---------------------------------------------|
| context     | The context of the question. (String)        |
| answer0     | The first answer option. (String)            |
| answer1     | The second answer option. (String)           |
| answer2     | The third answer option. (String)            |
| answer3     | The fourth answer option. (String)           |
| label       | The correct answer to the question. (String) |

File: train.csv

| Column name | Description |
|:------------|:---------------------------------------------|
| context     | The context of the question. (String)        |
| answer0     | The first answer option. (String)            |
| answer1     | The second answer option. (String)           |
| answer2     | The third answer option. (String)            |
| answer3     | The fourth answer option. (String)           |
| label       | The correct answer to the question. (String) |

File: test.csv

| Column name | Description |
|:------------|:---------------------------------------------|
| context     | The context of the question. (String)        |
| answer0     | The first answer option. (String)            |
| answer1     | The second answer option. (String)           |
| answer2     | The third answer option. (String)            |
| answer3     | The fourth answer option. (String)           |
| label       | The correct answer to the question. (String) |
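As a rough sketch of the answer-selection step described above, the snippet below builds a tiny stand-in frame mirroring the train.csv schema (in practice you would load the Kaggle download with `pd.read_csv("train.csv")`; the example row itself is invented for illustration):

```python
import pandas as pd

# Stand-in row mirroring the documented columns: context, answer0-3, label.
# In practice: df = pd.read_csv("train.csv") after the Kaggle download.
df = pd.DataFrame([{
    "context": "Sam forgot his umbrella and arrived home soaked.",
    "answer0": "It was raining.",
    "answer1": "He went swimming.",
    "answer2": "He spilled coffee.",
    "answer3": "None of the above.",
    "label": 0,
}])

row = df.iloc[0]
options = [row[f"answer{i}"] for i in range(4)]
correct = options[int(row["label"])]
print(correct)  # → It was raining.
```

The same indexing works for validation.csv and test.csv, since all three files share the column layout.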
An AI assistant for common sense QA.
nerdai/fedrag-commonsense-qa dataset hosted on Hugging Face and contributed by the HF Datasets community
Cosmos QA is a large-scale dataset of 35.6K problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. It focuses on reading between the lines over a diverse collection of people's everyday narratives, asking questions concerning the likely causes or effects of events that require reasoning beyond the exact text spans in the context.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cosmos_qa', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more information on tensorflow_datasets.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
This dataset contains 1,789 data instances with problem-identification, missing-resource, and time-dependent question-answer pairs for disaster management.
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
Huggingface Hub: link
We introduce Social IQa: Social Interaction QA, a new question-answering benchmark for testing social commonsense intelligence. Contrary to many prior benchmarks that focus on physical or taxonomic knowledge, Social IQa focuses on reasoning about people’s actions and their social implications. For example, given an action like "Jesse saw a concert" and a question like "Why did Jesse do this?", humans can easily infer that Jesse wanted "to see their favorite performer" or "to enjoy the music", and not "to see what's happening inside" or "to see if it works". The actions in Social IQa span a wide variety of social situations, and answer candidates contain both human-curated answers and adversarially-filtered machine-generated candidates. Social IQa contains over 37,000 QA pairs for evaluating models’ abilities to reason about the social implications of everyday events and situations.
This dataset can be used to train and test models for social commonsense question answering. The questions and answers in the dataset have been annotated by experts, and the dataset has been verified for accuracy.
- The dataset can be used to train a model to answer questions about social topics.
- The dataset can be used to improve question-answering systems for social commonsense reasoning.
- The dataset can be used to generate new questions about social topics.
Huggingface Hub: link
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: validation.csv

| Column name | Description |
|:------------|:------------------------------------------------------|
| context     | The context of the question. (String)                 |
| answerA     | One of the possible answers to the question. (String) |
| answerB     | One of the possible answers to the question. (String) |
| answerC     | One of the possible answers to the question. (String) |
| label       | The correct answer to the question. (String)          |

File: train.csv

| Column name | Description |
|:------------|:------------------------------------------------------|
| context     | The context of the question. (String)                 |
| answerA     | One of the possible answers to the question. (String) |
| answerB     | One of the possible answers to the question. (String) |
| answerC     | One of the possible answers to the question. (String) |
| label       | The correct answer to the question. (String)          |
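A minimal sketch of mapping a label to its answer column, using a stand-in row that mirrors the validation.csv schema. The label convention assumed here ("1" → answerA, "2" → answerB, and so on) is an assumption to verify against the actual files:

```python
import pandas as pd

# Stand-in row with the documented columns; in practice,
# df = pd.read_csv("validation.csv"). The label-to-column mapping
# ("1" -> answerA, ...) is assumed, not confirmed by the card.
df = pd.DataFrame([{
    "context": "Jesse saw a concert.",
    "answerA": "to see their favorite performer",
    "answerB": "to see what's happening inside",
    "answerC": "to see if it works",
    "label": "1",
}])

row = df.iloc[0]
answer_cols = ["answerA", "answerB", "answerC"]
correct = row[answer_cols[int(row["label"]) - 1]]
print(correct)  # → to see their favorite performer
```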
Commonsense QA CoT (Partial, Annotated) - PRELIMINARY
Dataset Summary
This dataset is a human-annotated subset of randomly sampled question-answer entries from the CommonsenseQA dataset (tau/commonsense_qa). The 'rationales' for each QA pair were created using a two-part method. First, Mixtral (mistralai/Mixtral-8x7B-Instruct-v0.1) was used to generate 3 unique CoT (Chain-of-Thought) explanations. Next, human evaluation was applied to distill the random sampling down to a… See the full description on the dataset page: https://huggingface.co/datasets/peterkchung/commonsense_cot_partial_annotated_prelim.
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
Huggingface Hub: link
CommonsenseQA is a multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions, each with one correct answer and four distractor answers. The dataset is provided in two major training/validation/testing splits: the "Random split", which is the main evaluation split, and the "Question token split"; see the paper for details.
- This dataset can be used to train a model to predict the correct answers to multiple-choice questions.
- This dataset can be used to evaluate the performance of different models on the CommonsenseQA dataset.
- This dataset can be used to discover new types of commonsense knowledge required to predict the correct answers to questions in the CommonsenseQA dataset.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: validation.csv

| Column name | Description |
|:------------|:---------------------------------------------------------------|
| answerKey   | The correct answer to the question. (String)                    |
| choices     | The five possible answers for each question. (List of strings)  |

File: train.csv

| Column name | Description |
|:------------|:---------------------------------------------------------------|
| answerKey   | The correct answer to the question. (String)                    |
| choices     | The five possible answers for each question. (List of strings)  |

File: test.csv

| Column name | Description |
|:------------|:---------------------------------------------------------------|
| answerKey   | The correct answer to the question. (String)                    |
| choices     | The five possible answers for each question. (List of strings)  |
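A minimal sketch of resolving an answerKey against the choices list, using a stand-in record. The serialization of 'choices' in the CSV files is an assumption; it is shown here as parallel label/text lists, as in the Hugging Face release of CommonsenseQA:

```python
# Stand-in record with the documented columns (answerKey, choices).
# The label/text structure of 'choices' is assumed from the HF release;
# the CSV encoding should be checked before relying on it.
record = {
    "answerKey": "A",
    "choices": {
        "label": ["A", "B", "C", "D", "E"],
        "text": ["bank", "library", "department store", "mall", "new york"],
    },
}

idx = record["choices"]["label"].index(record["answerKey"])
answer = record["choices"]["text"][idx]
print(answer)  # → bank
```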
Physical IQa: Physical Interaction QA, a new commonsense QA benchmark for naive physics reasoning focusing on how we interact with everyday objects in everyday situations. This dataset focuses on affordances of objects, i.e., what actions each physical object affords (e.g., it is possible to use a shoe as a doorstop), and what physical interactions a group of objects afford (e.g., it is possible to place an apple on top of a book, but not the other way around). The dataset requires reasoning about both the prototypical use of objects (e.g., shoes are used for walking) and non-prototypical but practically plausible use of objects (e.g., shoes can be used as a doorstop). The dataset includes 20,000 QA pairs that are either multiple-choice or true/false questions.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('piqa', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more information on tensorflow_datasets.
An AI tool for reasoning and common sense question-answering tasks.
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
Huggingface Hub: link
OpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject.
With OpenBookQA, we hope to push the boundaries of what current QA models can do and advance the state-of-the-art in this field. In addition to providing a challenging benchmark for existing models, we hope that this dataset will encourage new model architectures that can better handle complex questions and reasoning.
- Questions that require multi-step reasoning,
- Use of additional common and commonsense knowledge,
- Rich text comprehension
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: main_test.csv

| Column name   | Description |
|:--------------|:---------------------------------------------------------------|
| question_stem | The stem of the question. (String)                              |
| choices       | A list of answers to choose from. (List)                        |
| answerKey     | The index of the correct answer in the choices list. (Integer)  |

File: main_train.csv

| Column name   | Description |
|:--------------|:---------------------------------------------------------------|
| question_stem | The stem of the question. (String)                              |
| choices       | A list of answers to choose from. (List)                        |
| answerKey     | The index of the correct answer in the choices list. (Integer)  |

File: additional_train.csv

| Column name   | Description |
|:--------------|:---------------------------------------------------------------|
| question_stem | The stem of the question. (String)                              |
| choices       | A list of answers to choose from. (List)                        |
| answerKey     | The index of the correct answer in the choices list. (Integer)  |

File: additional_test.csv

| Column name   | Description |
|:--------------|:---------------------------------------------------------------|
| question_stem | The stem of the question. (String)                              |
| choices       | A list of answers to choose from. (List)                        |
| answerKey     | The index of the correct answer in the choices list. (Integer)  |

File: additional_validation.csv

| Column name   | Description |
|:--------------|:---------------------------------------------------------------|
| question_stem | The stem of the question. (String)                              |
| choices       | A list of answers to choose from. (List)                        |
| answerKey     | The index of the correct answer in the choices list. (Integer)  |
**File: ...
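A sketch of pairing a question stem with its correct choice under the schema above, using a stand-in record. The tables describe answerKey as an integer index into the choices list; other releases of OpenBookQA use a letter (A-D), so this convention is worth checking against the actual files:

```python
# Stand-in record with the documented columns; the integer answerKey
# convention follows the tables above and is an assumption to verify.
record = {
    "question_stem": "Which of these would let the most heat travel through?",
    "choices": ["a new pair of jeans", "a steel spoon",
                "a cotton candy", "a calvin klein cotton hat"],
    "answerKey": 1,
}

answer = record["choices"][record["answerKey"]]
print(answer)  # → a steel spoon
```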
HuggingFace Dataset by MorishT
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
A clone published to ensure reproducibility of evaluation scores and to release the SB Intuitions revised version. Source: yahoojapan/JGLUE on GitHub
datasets/jcommonsenseqa-v1.1
JCommonsenseQA
JCommonsenseQA is a Japanese version of CommonsenseQA (Talmor+, 2019), which is a multiple-choice question answering dataset that requires commonsense reasoning ability. It is built using crowdsourcing with seeds extracted from the knowledge base ConceptNet.
Licensing Information
Creative Commons Attribution Share Alike 4.0 International
Citation… See the full description on the dataset page: https://huggingface.co/datasets/sbintuitions/JCommonsenseQA.
Synthetic CommonSense
Generated using ChatGPT4, originally from https://huggingface.co/datasets/commonsense_qa. Notebook at https://github.com/mesolitica/malaysian-dataset/tree/master/question-answer/chatgpt4-commonsense.
synthetic-commonsense.jsonl: 36,332 rows, 7.34 MB.
Example data
{'question': '1. Seseorang yang bersara mungkin perlu kembali bekerja jika mereka apa? A. mempunyai hutang B. mencari pendapatan C. meninggalkan pekerjaan D. memerlukan… (English: "Someone who retires may need to return to work if they what? A. have debts B. seek income C. left their job D. need…") See the full description on the dataset page: https://huggingface.co/datasets/mesolitica/chatgpt4-commonsense-qa.
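Since the dataset ships as a single JSONL file, reading it is one `json.loads` per line. The sketch below uses an in-memory buffer as a stand-in for the downloaded synthetic-commonsense.jsonl, and the field name "question" is taken from the example record above (other fields are not assumed):

```python
import io
import json

# In-memory stand-in for open("synthetic-commonsense.jsonl"); one JSON
# object per line, skipping blank lines.
line = json.dumps({"question": "Seseorang yang bersara mungkin perlu "
                               "kembali bekerja jika mereka apa? ..."})
buf = io.StringIO(line + "\n")
records = [json.loads(l) for l in buf if l.strip()]
print(len(records))  # → 1
```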
MIT License: https://opensource.org/licenses/MIT
ChilleD/StrategyQA dataset hosted on Hugging Face and contributed by the HF Datasets community