MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "commonsense_qa"
Dataset Summary
CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions, each with one correct answer and four distractor answers. The dataset is provided in two major training/validation/test splits: the "Random split", which is the main evaluation split, and the "Question token split"; see the paper for details.… See the full description on the dataset page: https://huggingface.co/datasets/tau/commonsense_qa.
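A minimal sketch of working with one CommonsenseQA entry. The field names (`question`, `choices` with parallel `label`/`text` lists, `answerKey`) follow the schema shown on the tau/commonsense_qa card; the sample record below is illustrative, not drawn from the dataset.

```python
# Sketch: render one CommonsenseQA-style record as a lettered multiple-choice prompt.
# The sample record is a made-up stand-in with the card's field layout.

def format_mcq(example: dict) -> str:
    """Render a record as the question followed by lettered answer choices."""
    lines = [example["question"]]
    for label, text in zip(example["choices"]["label"], example["choices"]["text"]):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)

sample = {
    "question": "Where would you put uncooked pasta before boiling it?",
    "choices": {"label": ["A", "B", "C", "D", "E"],
                "text": ["pantry", "pot of water", "fridge", "box", "floor"]},
    "answerKey": "B",
}

print(format_mcq(sample))
```

In the real dataset each record also carries an `id` and a `question_concept` field; only the fields needed for prompting are used here.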
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Commonsense QA CoT (Partial, Raw, No Human Annotation)
Dataset Summary
Seeded by the CommonsenseQA dataset (tau/commonsense_qa), this preliminary set randomly samples 1,000 question-answer entries and uses Mixtral (mistralai/Mixtral-8x7B-Instruct-v0.1) to generate 3 unique Chain-of-Thought (CoT) rationales per entry. It was created as a preliminary step toward fine-tuning a language model (LM) to specialize in commonsense reasoning. The working hypothesis, inspired by the… See the full description on the dataset page: https://huggingface.co/datasets/peterkchung/commonsense_cot_partial_raw.
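The sampling step described above can be sketched in plain Python: draw 1,000 entries at random and build one rationale-generation prompt per entry. The prompt template and field names below are hypothetical placeholders, not the dataset authors' exact setup, and the actual calls to Mixtral are omitted.

```python
# Sketch: randomly sample 1,000 QA entries and build CoT-generation prompts.
# The prompt wording is an illustrative assumption; the corpus is synthetic.
import random

def build_cot_prompt(question: str, answer: str) -> str:
    """One rationale-generation prompt per QA pair (hypothetical template)."""
    return (f"Question: {question}\n"
            f"Answer: {answer}\n"
            "Explain step by step why this answer is correct.")

def sample_for_cot(entries: list, k: int = 1000, seed: int = 0) -> list:
    """Draw k entries reproducibly and turn each into a prompt."""
    rng = random.Random(seed)
    picked = rng.sample(entries, min(k, len(entries)))
    return [build_cot_prompt(e["question"], e["answer"]) for e in picked]

corpus = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(5000)]
prompts = sample_for_cot(corpus, k=1000)
print(len(prompts))  # 1000
```

To obtain the 3 rationales per entry described in the card, each prompt would be sent to the model three times (e.g. with sampling enabled) and the completions stored alongside the QA pair.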
CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions, each with one correct answer and four distractor answers. The dataset is provided in two major training/validation/test splits: the "Random split", which is the main evaluation split, and the "Question token split"; see the paper for details.
https://www.tau-nlp.org/commonsenseqa
https://arxiv.org/abs/1811.00937
Commonsense QA CoT (Partial, Annotated) - PRELIMINARY
Dataset Summary
This dataset is a human-annotated subset of randomly sampled question-answer entries from the CommonsenseQA dataset (tau/commonsense_qa). The rationales for each QA pair were created using a two-part method. First, Mixtral (mistralai/Mixtral-8x7B-Instruct-v0.1) was used to generate 3 unique Chain-of-Thought (CoT) explanations. Next, human evaluation was applied to distill the random sample down to a… See the full description on the dataset page: https://huggingface.co/datasets/peterkchung/commonsense_cot_partial_annotated_prelim.
mnlp-nsoai/commonsenseqa-1000 dataset hosted on Hugging Face and contributed by the HF Datasets community
This repository contains the publicly released dataset, code, and models for the Explanations for CommonsenseQA paper presented at ACL-IJCNLP 2021. The directories data and code inside the root folder contain the dataset and code, respectively. The same data and code are also made available through our AIHN collaboration partner institute, IIT Delhi. You can download the full paper from here.
Note that these annotations are provided for the questions of the CommonsenseQA data (https://www.tau-nlp.org/commonsenseqa): arXiv:1811.00937 cs.CL.
Citations
Please consider citing this paper as follows:
@inproceedings{aggarwaletal2021ecqa,
  title = "{E}xplanations for {C}ommonsense{QA}: {N}ew {D}ataset and {M}odels",
  author = "Shourya Aggarwal and Divyanshu Mandowara and Vishwajeet Agrawal and Dinesh Khandelwal and Parag Singla and Dinesh Garg",
  booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
  pages = "3050--3065",
  year = "2021",
  publisher = "Association for Computational Linguistics"
}
zen-E/CommonsenseQA-GPT4omini dataset hosted on Hugging Face and contributed by the HF Datasets community
ACCORD CSQA is an extension of the popular CommonsenseQA (CSQA) dataset using ACCORD, a scalable framework for disentangling the commonsense grounding and reasoning abilities of large language models (LLMs) through controlled, multi-hop counterfactuals. ACCORD closes the measurability gap between commonsense and formal reasoning tasks for LLMs. A detailed understanding of LLMs' commonsense reasoning abilities lags severely behind our understanding of their formal reasoning abilities, since commonsense benchmarks are difficult to construct in a rigorously quantifiable manner. Specifically, prior commonsense reasoning benchmarks and datasets are limited to one- or two-hop reasoning, or include an unknown (i.e., non-measurable) number of reasoning hops and/or distractors. Arbitrary scalability via compositional construction is also typical of formal reasoning tasks but lacking in commonsense reasoning. Finally, most prior commonsense benchmarks are either limited to a single reasoning skill or do not control skills. ACCORD aims to address all these gaps by introducing formal elements to commonsense reasoning, explicitly controlling and quantifying reasoning complexity beyond the typical one or two reasoning hops. Uniquely, ACCORD can automatically generate benchmarks of arbitrary reasoning complexity, so it scales with future LLM improvements. ACCORD CSQA is a benchmark suite comprising problems at six levels of reasoning difficulty, ACCORD CSQA 0 through ACCORD CSQA 5. Experiments on state-of-the-art LLMs show performance degrading to random chance with only moderate scaling, leaving substantial headroom for improvement.
Other license: https://choosealicense.com/licenses/other/
Dataset Card for Multilingual CommonsenseQA (mCSQA)
This dataset expands CommonsenseQA to eight languages from scratch using the same approach with LLMs and humans.
Abstract
From mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans (Findings of ACL2024)
It is very challenging to curate a dataset for language-specific knowledge and common sense in order to evaluate natural language understanding capabilities of… See the full description on the dataset page: https://huggingface.co/datasets/yusuke1997/mCSQA.
carminho/commonsense_qa-mt-pt dataset hosted on Hugging Face and contributed by the HF Datasets community
MNLP M2 Quantized MCQA Dataset
Train split of prompt/completion examples with an extra dataset column indicating source.
Column | Type | Description
prompt | string | The input question prompt
completion | string | The ground-truth answer
dataset | string | Source label (e.g. scienceqa, M1_chatgpt, qasc, mathqa, commonsenseqa, openbookqa)
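Because every row carries a `dataset` source label alongside its `prompt` and `completion`, the mixed split can be partitioned per source with a simple filter. The rows below are illustrative stand-ins for the real split, not actual examples from it.

```python
# Sketch: select rows from the mixed MCQA train split by the `dataset` column.
# The sample rows are made up; only the three-column layout follows the card.

rows = [
    {"prompt": "2 + 2 = ?", "completion": "4", "dataset": "mathqa"},
    {"prompt": "Where do fish live?", "completion": "water", "dataset": "commonsenseqa"},
    {"prompt": "What conducts electricity?", "completion": "copper", "dataset": "openbookqa"},
]

def by_source(rows, source):
    """Keep only examples whose `dataset` column matches the given source label."""
    return [r for r in rows if r["dataset"] == source]

print(by_source(rows, "commonsenseqa"))
```

The same filter applied per source label gives per-dataset subsets for targeted evaluation or rebalancing.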
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for SU-CSQA
Dataset Details
Dataset Description
Repository: rifkiaputri/id-csqa
Paper: Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese
Point of Contact: rifkiaputri
License: Creative Commons Non-Commercial (CC BY-NC 4.0)
In our paper, we investigate the effectiveness of using LLMs in generating culturally relevant CommonsenseQA datasets for Indonesian and Sundanese languages. To do so, we… See the full description on the dataset page: https://huggingface.co/datasets/rifkiaputri/su-csqa.
Llama 3.1 license: https://choosealicense.com/licenses/llama3.1/
Dataset Card for Llama-3.1-405B Evaluation Result Details
This dataset contains the Meta evaluation result details for Llama-3.1-405B. The dataset has been created from 12 evaluation tasks. These tasks are triviaqa_wiki, mmlu_pro, commonsenseqa, winogrande, mmlu, boolq, squad, quac, drop, bbh, arc_challenge, agieval_english. Each task detail can be found as a specific subset in each configuration and each subset is named using the task name plus the timestamp of the upload time… See the full description on the dataset page: https://huggingface.co/datasets/meta-llama/Llama-3.1-405B-evals.
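Since each subset name combines a task name with an upload timestamp, locating the subset for one task is a matter of matching on the task-name prefix. The exact naming scheme below (task, double underscore, timestamp) is an assumption for illustration; the card does not specify the separator.

```python
# Sketch: pick out the evaluation subset for one task from a list of subset names.
# The "<task>__<timestamp>" naming convention here is assumed, not confirmed.

TASKS = ["triviaqa_wiki", "mmlu_pro", "commonsenseqa", "winogrande", "mmlu",
         "boolq", "squad", "quac", "drop", "bbh", "arc_challenge", "agieval_english"]

def find_subsets(subset_names, task):
    """Return subset names whose task component matches exactly.

    Splitting on the separator (rather than prefix-matching) keeps
    "mmlu" from also matching "mmlu_pro".
    """
    return [s for s in subset_names if s.split("__")[0] == task]

names = [f"{t}__2024-07-23T00-00-00" for t in TASKS]
print(find_subsets(names, "mmlu"))
```

An exact match on the task component matters here because several task names share prefixes (mmlu vs. mmlu_pro).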
Llama 3.1 license: https://choosealicense.com/licenses/llama3.1/
Dataset Card for Llama-3.1-8B Evaluation Result Details
This dataset contains the Meta evaluation result details for Llama-3.1-8B. The dataset has been created from 12 evaluation tasks. These tasks are triviaqa_wiki, mmlu_pro, commonsenseqa, winogrande, mmlu, boolq, squad, quac, drop, bbh, arc_challenge, agieval_english. Each task detail can be found as a specific subset in each configuration and each subset is named using the task name plus the timestamp of the upload time and… See the full description on the dataset page: https://huggingface.co/datasets/meta-llama/Llama-3.1-8B-evals.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
A clone published to ensure reproducibility of evaluation scores and to release the SB Intuitions revised version. Source: yahoojapan/JGLUE on GitHub
datasets/jcommonsenseqa-v1.1
JCommonsenseQA
JCommonsenseQA is a Japanese version of CommonsenseQA (Talmor+, 2019), which is a multiple-choice question answering dataset that requires commonsense reasoning ability. It is built using crowdsourcing with seeds extracted from the knowledge base ConceptNet.
Licensing Information
Creative Commons Attribution Share Alike 4.0 International
Citation… See the full description on the dataset page: https://huggingface.co/datasets/sbintuitions/JCommonsenseQA.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "kor_commonsense_qa"
More Information needed
Source Data Citation Information
@inproceedings{talmor-etal-2019-commonsenseqa, title = "{C}ommonsense{QA}: A Question Answering Challenge Targeting Commonsense Knowledge", author = "Talmor, Alon and Herzig, Jonathan and Lourie, Nicholas and Berant, Jonathan", booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational… See the full description on the dataset page: https://huggingface.co/datasets/KETI-AIR/kor_commonsense_qa.