Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Large-scale Multi-modality Models Evaluation Suite
Accelerating the development of large-scale multi-modality models (LMMs) with lmms-eval
🏠 Homepage | 📚 Documentation | 🤗 Huggingface Datasets
This Dataset
This is a formatted version of DocVQA. It is used in our lmms-eval pipeline to allow for one-click evaluations of large multi-modality models. @article{mathew2020docvqa, title={DocVQA: A Dataset for VQA on Document Images. CoRR abs/2007.00398 (2020)}… See the full description on the dataset page: https://huggingface.co/datasets/lmms-lab/DocVQA.
DocVQA: A Dataset for VQA on Document Images
The DocVQA dataset can be downloaded from the challenge page in RRC portal ("Downloads" tab).
Dataset Structure
DocVQA comprises 50,000 questions framed on 12,767 images. The data is split randomly in an 80−10−10 ratio into train, validation and test splits.
Train split: 39,463 questions and 10,194 images
Validation split: 5,349 questions and 1,286 images
Test split: 5,188 questions and 1,287 images
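The split figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that only uses the counts quoted in this description; the dictionary layout and variable names are illustrative and not part of the dataset itself.

```python
# Per-split counts as quoted in the dataset description above.
splits = {
    "train": {"questions": 39_463, "images": 10_194},
    "validation": {"questions": 5_349, "images": 1_286},
    "test": {"questions": 5_188, "images": 1_287},
}

# The per-split counts should add up to the stated totals.
total_questions = sum(s["questions"] for s in splits.values())
total_images = sum(s["images"] for s in splits.values())
print(total_questions, total_images)  # 50000 12767

# Share of each split, to compare against the nominal 80-10-10 ratio.
for name, counts in splits.items():
    q_share = counts["questions"] / total_questions
    i_share = counts["images"] / total_images
    print(f"{name}: {q_share:.1%} of questions, {i_share:.1%} of images")
```

The totals match the stated 50,000 questions and 12,767 images; the per-split shares show how closely each split follows the nominal ratio.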
Resources and… See the full description on the dataset page: https://huggingface.co/datasets/eliolio/docvqa.
This dataset was created by Tushar Goel AI
lmms-lab/MP-DocVQA dataset hosted on Hugging Face and contributed by the HF Datasets community
vikhyatk/docvqa-val dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Dataset Card for DocVQA Dataset
Dataset Summary
The DocVQA dataset, introduced in Mathew et al. (2021), consists of 50,000 questions defined on 12,000+ document images. Please visit the challenge page (https://rrc.cvc.uab.es/?ch=17) and paper (https://arxiv.org/abs/2007.00398) for further information.
Usage
This dataset can be used with current releases of the Hugging Face datasets library. Here is an example using a custom collator to bundle… See the full description on the dataset page: https://huggingface.co/datasets/pixparse/docvqa-single-page-questions.
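The dataset card's own collator example is truncated here, so the following is only a generic sketch of what a custom collator for a document VQA dataset might look like. The field names `image`, `question` and `answers` are assumptions made for illustration and may not match the actual schema of this dataset.

```python
from typing import Any, Dict, List


def collate_docvqa(samples: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Bundle a list of per-sample dicts into one dict of per-field lists.

    Field names ("image", "question", "answers") are hypothetical.
    """
    return {
        "images": [s["image"] for s in samples],
        "questions": [s["question"] for s in samples],
        "answers": [s["answers"] for s in samples],
    }


# Usage with two toy samples; strings stand in for image objects.
batch = collate_docvqa([
    {"image": "page_0.png", "question": "What is the date?", "answers": ["1998"]},
    {"image": "page_1.png", "question": "Who signed it?", "answers": ["J. Doe"]},
])
print(batch["questions"])
```

A function of this shape can typically be passed as the `collate_fn` argument of a PyTorch `DataLoader`, since variable-sized images and answer lists cannot be stacked into tensors by the default collator.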
This dataset was created by Anton Bezzaborov
The dataset used for testing the Vary-base model, containing DocVQA and ChartQA datasets.
Dataset Description
This is a VQA dataset based on industrial documents from the MP-DocVQA dataset.
Load the dataset
from datasets import load_dataset
import csv

def load_beir_qrels(qrels_file):
    qrels = {}
    with open(qrels_file) as f:
        tsvreader = csv.DictReader(f, delimiter="\t")
        for row in tsvreader:
            qid = row["query-id"]
            pid = row["corpus-id"]
            rel = int(row["score"])
            if qid in qrels:…
See the full description on the dataset page: https://huggingface.co/datasets/openbmb/VisRAG-Ret-Test-MP-DocVQA.
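Because the snippet above is cut off, here is a self-contained sketch of how a BEIR-style qrels loader could be completed. It assumes a tab-separated file with the header columns `query-id`, `corpus-id` and `score`, as the truncated code suggests; the nested-dictionary return shape is an assumption, not necessarily what the dataset card's full example does.

```python
import csv
from collections import defaultdict
from typing import Dict


def load_beir_qrels(qrels_file: str) -> Dict[str, Dict[str, int]]:
    """Read a TSV qrels file into {query_id: {corpus_id: relevance}}."""
    qrels: Dict[str, Dict[str, int]] = defaultdict(dict)
    with open(qrels_file, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            # One row per (query, document) relevance judgment.
            qrels[row["query-id"]][row["corpus-id"]] = int(row["score"])
    return dict(qrels)
```

Calling `load_beir_qrels("qrels.tsv")` on such a file returns, for each query id, a mapping from corpus ids to integer relevance scores; the file name here is only a placeholder.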
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Nikhil Khandelwal
Released under CC0: Public Domain
tejeshbhalla/docvqa dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Dataset description
The doc-vqa dataset integrates images from the Infographic_vqa dataset, sourced from HuggingFaceM4's The Cauldron dataset, as well as images from the AFTDB (Arxiv Figure Table Database) dataset curated by cmarkea. The dataset consists of pairs of images and corresponding text, with each image linked to an average of five questions and answers available in both English and French. These questions and answers were generated using Gemini 1.5 Pro, thereby… See the full description on the dataset page: https://huggingface.co/datasets/cmarkea/doc-vqa.
HuggingFaceM4/DocumentVQA dataset hosted on Hugging Face and contributed by the HF Datasets community
hf-tuner/docvqa-10k-donut dataset
This dataset was created from the Tommynguyen02/doc-vqa dataset using this notebook
Dataset Summary
This dataset consists of 10k grayscale images of documents, each with a question and a ground-truth answer. For each question, only one answer, in lowercase letters, is selected from the Tommynguyen02/doc-vqa dataset and stored in a Donut-specific format.
RIPS-Goog-23/DocVQA dataset hosted on Hugging Face and contributed by the HF Datasets community
ashokpoudel/DOCVQA-Contract dataset hosted on Hugging Face and contributed by the HF Datasets community
ReplugLens/DocVQA dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
llamastack/docVQA dataset hosted on Hugging Face and contributed by the HF Datasets community