Large-scale Multi-modality Models Evaluation Suite
Accelerating the development of large-scale multi-modality models (LMMs) with lmms-eval
🏠 Homepage | 📚 Documentation | 🤗 Huggingface Datasets
This Dataset
This is a formatted version of derek-thomas/ScienceQA. It is used in our lmms-eval pipeline to allow for one-click evaluations of large multi-modality models. @inproceedings{lu2022learn, title={Learn to Explain: Multimodal Reasoning via Thought… See the full description on the dataset page: https://huggingface.co/datasets/lmms-lab/ScienceQA.
Xiaodong/ScienceQA dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
LIME-DATA/scienceqa dataset hosted on Hugging Face and contributed by the HF Datasets community
pkulium/ScienceQA dataset hosted on Hugging Face and contributed by the HF Datasets community
sxj1215/scienceqa dataset hosted on Hugging Face and contributed by the HF Datasets community
snorfyang/captioned-scienceqa dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
R1-Vision: Let's first take a look at the image
[🤗 Cold-Start Dataset] [📜 Report (Coming Soon)]
DeepSeek-R1 demonstrates outstanding reasoning abilities when tackling math, coding, puzzle, and science problems, as well as responding to general inquiries. However, as a text-only reasoning model, R1 cannot process multimodal inputs like images, which limits its practicality in certain situations. Exploring the potential for multimodal reasoning is an intriguing… See the full description on the dataset page: https://huggingface.co/datasets/yuyq96/R1-Vision-ScienceQA.
psroy/mini-platypus-scienceqa-two dataset hosted on Hugging Face and contributed by the HF Datasets community
This Dataset
This is a formatted version of derek-thomas/ScienceQA. It is used in our lmms-eval pipeline to allow for one-click evaluations of large multi-modality models. @inproceedings{lu2022learn, title={Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering}, author={Lu, Pan and Mishra, Swaroop and Xia, Tony and Qiu, Liang and Chang, Kai-Wei and Zhu, Song-Chun and Tafjord, Oyvind and Clark, Peter and Kalyan, Ashwin}, booktitle={The… See the full description on the dataset page: https://huggingface.co/datasets/AllenNella/mscienceqa_img.