Example 1
Input:
Please answer the question below, explaining your reasoning step by step before providing the final answer. Question: Are there enough straws for every cup ? A. yes B. no
Output:
The question asks whether there are enough straws to provide one for each cup depicted in an image.
To answer, we need to count the number of straws and cups separately and then compare those quantities.
The image shows three… See the full description on the dataset page: https://huggingface.co/datasets/mamangracing/LLaVA-CoT-o1-Instruct.
macabdul9/LLaVA-CoT-o1-eCoT-old dataset hosted on Hugging Face and contributed by the HF Datasets community
MLP-VLM2/llava-cot-20k-docvqa-chartqa dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for CoT
Dataset Sources
Repository: LLaVA-CoT GitHub Repository
Paper: LLaVA-CoT on arXiv
Dataset Structure
cat image.zip.part-* > image.zip  # not uploaded yet
unzip image.zip
The train.jsonl file contains the question-answering data and is structured in the following format: { "id": "example_id", "image": "example_image_path", "conversations": [ {"from": "human", "value": "Lütfen resimdeki kırmızı metal nesnelerin sayısını belirtin."}… (The sample prompt is Turkish for "Please state the number of red metal objects in the image.") See the full description on the dataset page: https://huggingface.co/datasets/berhaan/pisc-tr.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for CoT
Dataset Sources
Repository: LLaVA-CoT GitHub Repository
Paper: LLaVA-CoT on arXiv
Dataset Structure
unzip image.zip
The train.jsonl file contains the question-answering data and is structured in the following format: { "id": "example_id", "image": "example_image_path", "conversations": [ {"from": "human", "value": "Lütfen resimdeki kırmızı metal nesnelerin sayısını belirtin."}, {"from": "gpt", "value": "Resimde 3 kırmızı… (The sample prompt is Turkish for "Please state the number of red metal objects in the image."; the truncated reply begins "In the image, 3 red…".) See the full description on the dataset page: https://huggingface.co/datasets/berhaan/clevr-tr.
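The record layout described above can be read with a few lines of Python. This is a minimal sketch, not code from the dataset card: it assumes only the fields visible in the truncated excerpt ("id", "image", and "conversations" with "from"/"value" turns), and the helper names iter_examples and question_answer_pairs, as well as the placeholder question and answer strings, are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_examples(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def question_answer_pairs(record: dict) -> list:
    """Pair each "human" turn with the "gpt" turn that directly follows it."""
    turns = record.get("conversations", [])
    pairs = []
    for prev, curr in zip(turns, turns[1:]):
        if prev.get("from") == "human" and curr.get("from") == "gpt":
            pairs.append((prev["value"], curr["value"]))
    return pairs


# Hypothetical record that mirrors the documented layout; the values are
# placeholders, not real dataset content.
sample_line = json.dumps({
    "id": "example_id",
    "image": "example_image_path",
    "conversations": [
        {"from": "human", "value": "example question"},
        {"from": "gpt", "value": "example answer"},
    ],
})

for record in iter_examples([sample_line]):
    print(record["id"], record["image"], question_answer_pairs(record))
```

To read the real file, pass an open file handle instead of the list, for example `open("train.jsonl", encoding="utf-8")`, since iterating over a text file yields one line at a time.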
5CD-AI/Viet-LLaVA-CoT-o1-Instruct dataset hosted on Hugging Face and contributed by the HF Datasets community
WeThink-Multimodal-Reasoning-120K
Image Type
Image data can be accessed from https://huggingface.co/datasets/Xkev/LLaVA-CoT-100k
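Image Type | Source Dataset | Images
General Images | COCO | 25,344
General Images | SAM-1B | 18,091
General Images | Visual Genome | 4,441
General Images | GQA | 3,251
General Images | PISC | 835
General Images | LLaVA | 134
Text-Intensive Images | TextVQA | 25,483
Text-Intensive Images | ShareTextVQA | 538
Text-Intensive Images | DocVQA | 4,709
Text-Intensive Images | OCR-VQA | 5,142
Text-Intensive Images | ChartQA | 21,781
Scientific & Technical | GeoQA+ | 4,813
Scientific & Technical | ScienceQA | 4,990
Scientific & Technical | AI2D | 1,812
Scientific & Technical | CLEVR-Math | 677…
See the full description on the dataset page: https://huggingface.co/datasets/WeThink/WeThink-Multimodal-Reasoning-120K.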
ahmedheakl/llavacot-r1-RL dataset hosted on Hugging Face and contributed by the HF Datasets community
ahmedheakl/llavacot-think dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
RESA-CoT Dataset
The RESA-CoT dataset is a multimodal dataset designed for large language model alignment and reasoning research. It consists of image-conversation pairs in LLaVA format, enhanced with Chain-of-Thought (CoT) style reasoning to improve interpretability and alignment.
Dataset Versions
RESA
Based on VLGuard data. Augmented using GPT-4o to generate CoT-style conversations.
RESA-mix
Combines RESA with 10K LLaVA-NEXT samples. Also enhanced with CoT-style… See the full description on the dataset page: https://huggingface.co/datasets/yfwang22/RESA-CoT-data.