PIQA is a dataset for commonsense reasoning, created to investigate the physical knowledge of existing NLP models.
The vikhyatk/piqa dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
Physical IQa (Physical Interaction QA) is a commonsense QA benchmark for naive physics reasoning, focusing on how we interact with everyday objects in everyday situations. The dataset focuses on affordances of objects, i.e., what actions each physical object affords (e.g., it is possible to use a shoe as a doorstop), and what physical interactions a group of objects afford (e.g., it is possible to place an apple on top of a book, but not the other way around). It requires reasoning about both the prototypical use of objects (e.g., shoes are used for walking) and non-prototypical but practically plausible uses (e.g., shoes can be used as a doorstop). The dataset includes 20,000 QA pairs that are either multiple-choice or true/false questions.
To use this dataset:
import tensorflow_datasets as tfds

ds = tfds.load('piqa', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
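Each example pairs a goal with two candidate solutions. Assuming the standard PIQA schema (`goal`, `sol1`, `sol2`, `label`), a hardcoded example can be rendered as a multiple-choice prompt; this is a sketch, not tied to any particular loader:

```python
# A single PIQA-style example, hardcoded so the sketch needs no download.
# Field names (goal, sol1, sol2, label) follow the standard PIQA schema.
example = {
    "goal": "To apply eyeshadow without a brush, should I use a cotton swab or a toothpick?",
    "sol1": "Use a cotton swab.",
    "sol2": "Use a toothpick.",
    "label": 0,  # index of the correct solution (0 -> sol1, 1 -> sol2)
}

def to_prompt(ex):
    """Render one example as a two-option multiple-choice prompt."""
    return (
        f"Goal: {ex['goal']}\n"
        f"A) {ex['sol1']}\n"
        f"B) {ex['sol2']}\n"
        "Answer:"
    )

print(to_prompt(example))
```

The prompt format itself (letter options, "Answer:" suffix) is an illustrative choice, not part of the dataset.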
To apply eyeshadow without a brush, should I use a cotton swab or a toothpick?
Questions requiring this kind of physical commonsense pose a challenge to state-of-the-art
natural language understanding systems. PIQA introduces the task of physical commonsense reasoning
together with a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA).
Physical commonsense knowledge is a major challenge on the road to true AI-completeness,
including robots that interact with the world and understand natural language.
PIQA focuses on everyday situations with a preference for atypical solutions.
The dataset is inspired by instructables.com, which provides users with instructions on how to build, craft,
bake, or manipulate objects using everyday materials.
The underlying task is formulated as multiple-choice question answering: given a question q
and two possible solutions s1 and s2, a model or a human must choose the more appropriate solution,
of which exactly one is correct.
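In practice, a system picks between s1 and s2 by scoring each (question, solution) pair and taking the higher-scoring one. Here is a minimal sketch with a stand-in scorer; a real system would substitute a model's log-likelihood for `score`:

```python
def score(question, solution):
    """Stand-in scorer: counts word overlap between question and solution.
    A real system would use a model's log-likelihood here instead."""
    q_words = set(question.lower().split())
    return sum(1 for w in solution.lower().split() if w in q_words)

def choose(question, sol1, sol2):
    """Return the index (0 or 1) of the higher-scoring solution."""
    return 0 if score(question, sol1) >= score(question, sol2) else 1

# Hypothetical question and solutions for illustration only.
question = "How do I keep a door from closing?"
pred = choose(question, "Wedge a shoe under the door.", "Eat the shoe.")
print(pred)
```

The word-overlap scorer is intentionally naive; the point is the argmax-over-two-scores structure of the task.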
The dataset is further cleaned of basic artifacts using the AFLite algorithm, an improved form of
adversarial filtering. It contains 16,000 examples for training, 2,000 for development, and 3,000 for testing.
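With a two-choice format, evaluation reduces to accuracy over the labeled splits. A minimal sketch, using made-up predictions and labels purely for illustration:

```python
def accuracy(predictions, labels):
    """Fraction of examples where the predicted choice matches the gold label."""
    assert len(predictions) == len(labels)
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical predictions and gold labels (0 -> sol1, 1 -> sol2).
preds  = [0, 1, 1, 0, 1]
labels = [0, 1, 0, 0, 1]
print(accuracy(preds, labels))  # 4 of 5 correct -> 0.8
```

Note that random guessing sits at 50% on this format, so reported PIQA numbers should be read against that baseline.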
The extraordinarylab/piqa dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The bengaliAI/PIQA, RAR-b/piqa, and 1-800-LLMs/piqa datasets are hosted on Hugging Face and were contributed by the HF Datasets community.
License: Academic Free License v3.0, https://choosealicense.com/licenses/afl-3.0/
Dataset Card for PIQA-eu
Point of Contact: hitz@ehu.eus
Dataset Description
Dataset Summary
PIQA-eu is a professional translation into Basque of the PIQA (Bisk et al., 2020) validation partition. PIQA is a commonsense QA benchmark for naive physics reasoning, focusing on how we interact with everyday objects in everyday situations.
Languages
eu-ES
Dataset Structure
Data Instances
PIQA-eu examples look like this: {… See the full description on the dataset page: https://huggingface.co/datasets/HiTZ/PIQA-eu.
The baber/piqa, dogtooth/piqa, and Narpear/piqa datasets are hosted on Hugging Face and were contributed by the HF Datasets community.
License: Academic Free License v3.0, https://choosealicense.com/licenses/afl-3.0/
Dataset Card for piqa_ca
piqa_ca is a multiple-choice question answering dataset in Catalan that has been professionally translated from the English PIQA validation set.
Dataset Details
Dataset Description
piqa_ca (Physical Interaction Question Answering - Catalan) is designed to evaluate physical commonsense reasoning using question-answer triplets based on everyday situations. It includes 1,838 instances in the validation split. Each instance contains… See the full description on the dataset page: https://huggingface.co/datasets/projecte-aina/piqa_ca.
This dataset is generated by Lilac for a HuggingFace Space: huggingface.co/spaces/lilacai/lilac. Original dataset: https://huggingface.co/datasets/piqa

Lilac dataset config:

name: piqa
source:
  dataset_name: piqa
  source_name: huggingface
embeddings:
  - path: goal
    embedding: gte-small
  - path: sol1
    embedding: gte-small
  - path: sol2
    embedding: gte-small
signals:
  - path: goal
    signal:
      signal_name: near_dup
  - path: goal
    signal:
      signal_name: pii
  - path:… See the full description on the dataset page: https://huggingface.co/datasets/lilacai/lilac-piqa.
The aq1048576/piqa-unsupervised-elicitation dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
License: Academic Free License v3.0, https://choosealicense.com/licenses/afl-3.0/
PIQA: a dataset in MTEB, the Massive Text Embedding Benchmark.
This task measures the ability to retrieve the ground-truth answers to reasoning-task queries on PIQA.
Task category t2t
Domains Encyclopaedic, Written
Reference https://arxiv.org/abs/1911.11641
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb

task = mteb.get_task("PIQA")
evaluator = mteb.MTEB([task])
model = mteb.get_model(YOUR_MODEL)… See the full description on the dataset page: https://huggingface.co/datasets/mteb/PIQA.
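Under the retrieval framing, each query is matched against candidate answers by embedding similarity. A toy sketch with hand-made vectors and cosine similarity; a real MTEB run would produce the vectors with a trained embedding model:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy embeddings standing in for a real model's output.
query = [1.0, 0.0, 1.0]
candidates = {
    "wedge a shoe under the door": [0.9, 0.1, 0.8],
    "eat the shoe": [0.0, 1.0, 0.1],
}

# Retrieve the candidate whose vector is closest to the query.
best = max(candidates, key=lambda c: cosine(query, candidates[c]))
print(best)
```

The three-dimensional vectors are fabricated for the sketch; the retrieval step (argmax over cosine similarity) is what the benchmark measures at scale.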
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
piqa
Dataset Description
This dataset contains evaluation results for piqa with label column N_A, with various model performance metrics and samples.
Dataset Summary
The dataset contains original samples from the evaluation process, along with metadata like model names, input columns, and scores. This helps with understanding model performance across different tasks and datasets.
Features
id: Unique identifier for the sample. user: User… See the full description on the dataset page: https://huggingface.co/datasets/gallifantjack/piqa_N_A.
PIQA MK version
This dataset is a Macedonian adaptation of the PIQA dataset, originally curated (English -> Serbian) by Aleksa Gordić. It was then translated from Serbian to Macedonian using the Google Translate API. You can find this dataset as part of macedonian-llm-eval on GitHub and Hugging Face.
Why Translate from Serbian?
The Serbian dataset was selected as the source instead of English because Serbian and Macedonian are closer linguistically, making… See the full description on the dataset page: https://huggingface.co/datasets/LVSTCK/piqa-mk.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Summary
This is the Bengali-translated version of the PIQA LLM evaluation dataset. It was translated using a method called Expressive Semantic Translation (EST), which combines Google Translation with LLM-based rewriting. PIQA introduces the task of physical commonsense reasoning and provides a corresponding benchmark for understanding physical interactions in everyday situations. It focuses on atypical solutions to practical problems, inspired by instructional guides… See the full description on the dataset page: https://huggingface.co/datasets/hishab/piqa-bn.
The carminho/piqa-mt-pt dataset is hosted on Hugging Face and was contributed by the HF Datasets community.