Dataset Card for "A-OKVQA"
More Information needed
This dataset was created by ZhangJoyee
OK-VQA is a new dataset for visual question answering that requires methods which can draw upon outside knowledge to answer questions.
- 14,055 open-ended questions
- 5 ground truth answers per question
- Manually filtered to ensure all questions require outside knowledge (e.g. from Wikipedia)
- Reduced the number of questions with the most common answers to limit dataset bias
lmms-lab/OK-VQA dataset hosted on Hugging Face and contributed by the HF Datasets community
GY2233/geo170k-8k-r1-VisualPuzzles-TQA-ai2d-r1-RL-lmms-ScienceQA-IMG-A-OKVQA dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/cc/
A-OK-VQA-eu (Basque Translation • 5 K Sample)
Overview
A-OK-VQA-eu is a Basque-language subset of the original A-OKVQA knowledge-based visual question-answering benchmark. A random sample of 5 000 English QA pairs was translated into Basque with HiTZ/Latxa-Llama-3.1-70B-Instruct; roughly 20 % of those translations were manually post-edited to ensure fluency and adequacy. Important: This is not the official dataset. It is an independent community translation intended… See the full description on the dataset page: https://huggingface.co/datasets/lukasArana/A-OK-VQA-eu.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
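The sampling step described for A-OK-VQA-eu (a random sample of 5 000 QA pairs, with roughly 20 % of the translations post-edited by hand) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `sample_for_translation`, the `needs_post_edit` flag, the seed, and the placeholder QA pairs are all assumptions, and the machine-translation step itself is left out.

```python
import random


def sample_for_translation(qa_pairs, sample_size=5000, post_edit_fraction=0.2, seed=0):
    """Draw a random sample and flag a fraction of it for manual post-editing.

    Hypothetical helper: the dataset card does not say how its sample
    or its post-edited subset were actually selected.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(qa_pairs, sample_size)  # sampling without replacement
    n_post_edit = round(sample_size * post_edit_fraction)
    post_edit_positions = set(rng.sample(range(sample_size), n_post_edit))
    return [
        {**pair, "needs_post_edit": i in post_edit_positions}
        for i, pair in enumerate(sample)
    ]


# Placeholder English QA pairs standing in for the real source data.
english_pairs = [
    {"id": i, "question": f"question {i}", "answer": f"answer {i}"}
    for i in range(20000)
]

subset = sample_for_translation(english_pairs)
print(len(subset), sum(p["needs_post_edit"] for p in subset))  # prints "5000 1000"
```

Fixing the seed keeps the sample reproducible, which matters if the translated subset needs to be aligned with the English originals later.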
Overview
This dataset supports high-quality Visual Question Answering (VQA) focused on knowledge-based reasoning. Each sample is annotated with structured reasoning steps, progressing from image observation to external knowledge retrieval and answer derivation. It is constructed based on and expands upon OK-VQA and A-OKVQA. The dataset is suitable for training and evaluating models in explainable multimodal reasoning tasks.
Sample Data Format (JSON)
{ "id": "0"… See the full description on the dataset page: https://huggingface.co/datasets/MIL-UT/JA-OKVQA-Reasoning.
MrZilinXiao/MMEB-eval-A-OKVQA-beir-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
ShahadMAlshalawi/okvqa-ar dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
phoebe777777/okvqa-privacy-attributes dataset hosted on Hugging Face and contributed by the HF Datasets community
HB-LEE/pope-aokvqa-random dataset hosted on Hugging Face and contributed by the HF Datasets community
DeFacto Counterfactual Dataset
Paper link: https://arxiv.org/abs/2509.20912
This repository contains the DeFacto Counterfactual Dataset, constructed to support research on faithful multimodal reasoning and counterfactual supervision. The dataset is built from a broad collection of visual question answering (VQA) and document understanding benchmarks, including:
Natural image benchmarks: VQAv2, OKVQA, GQA, ScienceQA, VizWiz
Text-centric benchmarks: TextVQA, OCRVQA, AI2D, DocVQA… See the full description on the dataset page: https://huggingface.co/datasets/tinnel123/defacto_dataset.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Dataset Name
This medium-sized dataset (20K samples) was created from the A-OKVQA train and val splits, the Path-VQA train and val splits, and the TDIUC val split (quantitative and physical reasoning questions only). It is a multidomain dataset created solely to test the multidomain knowledge of VLMs, and it can be used for inference or rapid prototyping. It is for educational and research purposes only. All copyright belongs to the original owners of the datasets.