58 datasets found
  1. DocVQA Dataset

    • paperswithcode.com
    Updated Nov 21, 2024
    Cite
    Minesh Mathew; Dimosthenis Karatzas; C. V. Jawahar (2024). DocVQA Dataset [Dataset]. https://paperswithcode.com/dataset/docvqa
    Dataset updated
    Nov 21, 2024
    Authors
    Minesh Mathew; Dimosthenis Karatzas; C. V. Jawahar
    Description

    DocVQA consists of 50,000 questions defined on 12,000+ document images.

  2. DocVQA

    • huggingface.co
    Cite
    LMMs-Lab, DocVQA [Dataset]. https://huggingface.co/datasets/lmms-lab/DocVQA
    Dataset authored and provided by
    LMMs-Lab
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Large-scale Multi-modality Models Evaluation Suite

    Accelerating the development of large-scale multi-modality models (LMMs) with lmms-eval

    This Dataset

    This is a formatted version of DocVQA. It is used in our lmms-eval pipeline to allow for one-click evaluations of large multi-modality models.

    @article{mathew2020docvqa, title={DocVQA: A Dataset for VQA on Document Images. CoRR abs/2007.00398 (2020)}…

    See the full description on the dataset page: https://huggingface.co/datasets/lmms-lab/DocVQA.

  3. MP-DocVQA Dataset

    • paperswithcode.com
    Updated Apr 2, 2023
    Cite
    Rubèn Tito; Dimosthenis Karatzas; Ernest Valveny (2023). MP-DocVQA Dataset [Dataset]. https://paperswithcode.com/dataset/mp-docvqa
    Dataset updated
    Apr 2, 2023
    Authors
    Rubèn Tito; Dimosthenis Karatzas; Ernest Valveny
    Description

    The dataset is designed for Visual Question Answering on multi-page scanned industry documents. The questions and answers are reused from the Single Page DocVQA (SP-DocVQA) dataset. The images also correspond to those in the original dataset, extended with the preceding and following pages, up to a limit of 20 pages per document.

  4. MP-DocVQA

    • huggingface.co
    Updated Oct 4, 2024
    Cite
    LMMs-Lab (2024). MP-DocVQA [Dataset]. https://huggingface.co/datasets/lmms-lab/MP-DocVQA
    Dataset updated
    Oct 4, 2024
    Dataset authored and provided by
    LMMs-Lab
    Description

    lmms-lab/MP-DocVQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. docvqa_test_subsampled

    • huggingface.co
    Updated Jan 23, 2025
    Cite
    Vidore (2025). docvqa_test_subsampled [Dataset]. https://huggingface.co/datasets/vidore/docvqa_test_subsampled
    Dataset updated
    Jan 23, 2025
    Dataset authored and provided by
    Vidore
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Description

    This is the test set taken from the DocVQA dataset. It includes collected images from the UCSF Industry Documents Library. Questions and answers were manually annotated. Example of data (see viewer)

    Data Curation

    To ensure homogeneity across our benchmarked datasets, we subsampled the original test set to 500 pairs and renamed the different columns.

    Load the dataset

    from datasets import load_dataset
    ds =…

    See the full description on the dataset page: https://huggingface.co/datasets/vidore/docvqa_test_subsampled.
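The curation step described in this card (subsample the test set to 500 pairs, then rename columns) can be sketched in plain Python. This is only an illustration: the card does not say how the 500 pairs were drawn or what the columns were renamed to, so the seeded uniform sample, the helper name `subsample_and_rename`, and the column names below are all assumptions.

```python
import random


def subsample_and_rename(rows, n=500, seed=0, column_map=None):
    """Draw up to n rows uniformly without replacement, then rename columns.

    Hypothetical helper: the sampling strategy and the column names are
    assumptions, not taken from the dataset card.
    """
    column_map = column_map or {}
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    picked = list(rows) if len(rows) <= n else rng.sample(rows, n)
    # Keys that are absent from column_map keep their original name.
    return [{column_map.get(k, k): v for k, v in row.items()} for row in picked]


# Toy stand-in for a test split; real rows would also carry the page image.
toy_rows = [{"question": f"q{i}", "answers": [f"a{i}"]} for i in range(2000)]
subset = subsample_and_rename(
    toy_rows,
    n=500,
    seed=0,
    column_map={"question": "query", "answers": "answer"},  # hypothetical names
)
print(len(subset), sorted(subset[0].keys()))
```

Fixing the seed keeps the subset stable across runs, which matters when the same 500 pairs are meant to serve as a shared benchmark.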

  6. docvqa-single-page-questions

    • huggingface.co
    Updated Mar 29, 2024
    Cite
    Pixel Parsing (2024). docvqa-single-page-questions [Dataset]. https://huggingface.co/datasets/pixparse/docvqa-single-page-questions
    Dataset updated
    Mar 29, 2024
    Dataset authored and provided by
    Pixel Parsing
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for DocVQA Dataset

    Dataset Summary

    The DocVQA dataset is a document dataset introduced in Mathew et al. (2021), consisting of 50,000 questions defined on 12,000+ document images. Please visit the challenge page (https://rrc.cvc.uab.es/?ch=17) and paper (https://arxiv.org/abs/2007.00398) for further information.

    Usage

    This dataset can be used with current releases of the Hugging Face datasets library. Here is an example using a custom collator to bundle…

    See the full description on the dataset page: https://huggingface.co/datasets/pixparse/docvqa-single-page-questions.

  7. docvqa-val

    • huggingface.co
    Updated Jan 5, 2025
    Cite
    Vik Korrapati (2025). docvqa-val [Dataset]. https://huggingface.co/datasets/vikhyatk/docvqa-val
    Dataset updated
    Jan 5, 2025
    Authors
    Vik Korrapati
    Description

    vikhyatk/docvqa-val dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. DocVQA

    • opendatalab.com
    zip
    Updated Sep 30, 2023
    Cite
    International Institute for Information Technology, Hyderabad (2023). DocVQA [Dataset]. https://opendatalab.com/OpenDataLab/DocVQA
    Dataset updated
    Sep 30, 2023
    Dataset provided by
    Computer Vision Center
    International Institute for Information Technology, Hyderabad
    License

    https://rrc.cvc.uab.es/?ch=17&com=downloads

    Description

    Document Visual Question Answering (DocVQA) seeks to inspire a "purpose-driven" point of view in Document Analysis and Recognition research, where the document content is extracted and used to respond to high-level tasks defined by the human consumers of this information. To this end, we organize a series of challenges and release datasets to enable machines to "understand" document images and thereby answer questions asked about them. There are 50K questions and 12K images in the dataset. Images are collected from the UCSF Industry Documents Library. Questions and answers are manually annotated.

  9. VisRAG-Ret-Test-MP-DocVQA

    • huggingface.co
    Updated Jun 22, 2025
    Cite
    OpenBMB (2025). VisRAG-Ret-Test-MP-DocVQA [Dataset]. https://huggingface.co/datasets/openbmb/VisRAG-Ret-Test-MP-DocVQA
    Dataset updated
    Jun 22, 2025
    Dataset authored and provided by
    OpenBMB
    Description

    Dataset Description

    This is a VQA dataset based on industrial documents from the MP-DocVQA dataset.

    Load the dataset

    from datasets import load_dataset
    import csv

    def load_beir_qrels(qrels_file):
        qrels = {}
        with open(qrels_file) as f:
            tsvreader = csv.DictReader(f, delimiter="\t")
            for row in tsvreader:
                qid = row["query-id"]
                pid = row["corpus-id"]
                rel = int(row["score"])
                if qid in qrels:…

    See the full description on the dataset page: https://huggingface.co/datasets/openbmb/VisRAG-Ret-Test-MP-DocVQA.
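The snippet on the card is cut off partway through the loop. One plausible way such a loader could finish is sketched below. The column names `query-id`, `corpus-id` and `score` come from the card; the nested-dictionary return shape and the toy file contents are assumptions, not the card's actual code.

```python
import csv
import os
import tempfile


def load_beir_qrels(qrels_file):
    """Read a tab-separated qrels file into {query_id: {corpus_id: score}}.

    The nested-dict layout is an assumed completion of the truncated snippet.
    """
    qrels = {}
    with open(qrels_file, newline="") as f:
        tsvreader = csv.DictReader(f, delimiter="\t")
        for row in tsvreader:
            qid = row["query-id"]
            pid = row["corpus-id"]
            rel = int(row["score"])
            # Group relevance judgments by query, creating the inner dict on first use.
            qrels.setdefault(qid, {})[pid] = rel
    return qrels


# Write a tiny made-up qrels file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False, newline="") as tmp:
    tmp.write("query-id\tcorpus-id\tscore\n")
    tmp.write("q1\tpage-3\t1\n")
    tmp.write("q1\tpage-7\t0\n")
    tmp.write("q2\tpage-1\t1\n")
    path = tmp.name

demo_qrels = load_beir_qrels(path)
os.remove(path)
print(demo_qrels)
```

Keying by query first makes it cheap to look up every judged page for a given question when scoring a retriever.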

  10. DOCVQA

    • huggingface.co
    Cite
    Vansh Agrawal, DOCVQA [Dataset]. https://huggingface.co/datasets/Slicky325/DOCVQA
    Authors
    Vansh Agrawal
    Description

    Slicky325/DOCVQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang,...

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang (2024). DocVQA and ChartQA Datasets [Dataset]. https://doi.org/10.57702/c539jlef. https://service.tib.eu/ldmservice/dataset/docvqa-and-chartqa-datasets
    Dataset updated
    Dec 2, 2024
    Description

    The dataset used for testing the Vary-base model, containing DocVQA and ChartQA datasets.

  12. Data from: InfographicVQA Dataset

    • paperswithcode.com
    • opendatalab.com
    • +1 more
    Updated Apr 27, 2021
    Cite
    Minesh Mathew; Viraj Bagal; Rubèn Pérez Tito; Dimosthenis Karatzas; Ernest Valveny; C. V Jawahar (2021). InfographicVQA Dataset [Dataset]. https://paperswithcode.com/dataset/infographicvqa
    Dataset updated
    Apr 27, 2021
    Authors
    Minesh Mathew; Viraj Bagal; Rubèn Pérez Tito; Dimosthenis Karatzas; Ernest Valveny; C. V Jawahar
    Description

    InfographicVQA is a dataset that comprises a diverse collection of infographics along with natural-language question and answer annotations. The collected questions require methods to jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with emphasis on questions that require elementary reasoning and basic arithmetic skills.

  13. TextVQA Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Apr 13, 2025
    Cite
    Amanpreet Singh; Vivek Natarajan; Meet Shah; Yu Jiang; Xinlei Chen; Dhruv Batra; Devi Parikh; Marcus Rohrbach (2025). TextVQA Dataset [Dataset]. https://paperswithcode.com/dataset/textvqa
    Dataset updated
    Apr 13, 2025
    Authors
    Amanpreet Singh; Vivek Natarajan; Meet Shah; Yu Jiang; Xinlei Chen; Dhruv Batra; Devi Parikh; Marcus Rohrbach
    Description

    TextVQA is a dataset to benchmark visual reasoning based on text in images. TextVQA requires models to read and reason about text in images to answer questions about them. Specifically, models need to incorporate a new modality of text present in the images and reason over it to answer TextVQA questions.

    Statistics
    * 28,408 images from OpenImages
    * 45,336 questions
    * 453,360 ground truth answers
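The question and answer counts listed in the statistics imply a fixed number of ground-truth answers per question. A quick arithmetic check, with illustrative variable names:

```python
# Counts as listed in the TextVQA statistics.
questions = 45_336
ground_truth_answers = 453_360

# divmod returns the whole-number ratio and any leftover answers.
per_question, remainder = divmod(ground_truth_answers, questions)
print(per_question, remainder)
```

A zero remainder means the listed totals are consistent with every question carrying the same number of reference answers.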

  14. doc-vqa

    • huggingface.co
    Updated Jun 18, 2024
    Cite
    Credit Mutuel Arkea (2024). doc-vqa [Dataset]. https://huggingface.co/datasets/cmarkea/doc-vqa
    Dataset updated
    Jun 18, 2024
    Dataset authored and provided by
    Credit Mutuel Arkea
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset description

    The doc-vqa Dataset integrates images from the Infographic_vqa dataset sourced from HuggingFaceM4 The Cauldron dataset, as well as images from the dataset AFTDB (Arxiv Figure Table Database) curated by cmarkea. This dataset consists of pairs of images and corresponding text, with each image linked to an average of five questions and answers available in both English and French. These questions and answers were generated using Gemini 1.5 Pro, thereby… See the full description on the dataset page: https://huggingface.co/datasets/cmarkea/doc-vqa.

  15. DocVQA

    • huggingface.co
    Updated Aug 19, 2023
    Cite
    RIPS-Google-23 (2023). DocVQA [Dataset]. https://huggingface.co/datasets/RIPS-Goog-23/DocVQA
    Dataset updated
    Aug 19, 2023
    Dataset authored and provided by
    RIPS-Google-23
    Description

    RIPS-Goog-23/DocVQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. docvqa-test

    • huggingface.co
    Updated Jul 1, 1999
    Cite
    Agustín Piqueres Lajarín (1999). docvqa-test [Dataset]. https://huggingface.co/datasets/plaguss/docvqa-test
    Dataset updated
    Jul 1, 1999
    Authors
    Agustín Piqueres Lajarín
    Description

    plaguss/docvqa-test dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. DocumentVQA

    • huggingface.co
    Updated May 4, 2000
    Cite
    HuggingFaceM4 (2000). DocumentVQA [Dataset]. https://huggingface.co/datasets/HuggingFaceM4/DocumentVQA
    Dataset updated
    May 4, 2000
    Dataset authored and provided by
    HuggingFaceM4
    Description

    HuggingFaceM4/DocumentVQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. docvqa

    • huggingface.co
    Updated Jul 26, 2023
    Cite
    Varun Nagpal (2023). docvqa [Dataset]. https://huggingface.co/datasets/spyzvarun/docvqa
    Dataset updated
    Jul 26, 2023
    Authors
    Varun Nagpal
    Description

    Dataset Card for "docvqa"

    More Information needed

  19. docvqa

    • huggingface.co
    Updated Jul 20, 2025
    Cite
    Jina AI (2025). docvqa [Dataset]. https://huggingface.co/datasets/jinaai/docvqa
    Dataset updated
    Jul 20, 2025
    Dataset authored and provided by
    Jina AI
    Description

    Disclaimer

    This dataset may contain publicly available images or text data. All data is provided for research and educational purposes only. If you are the rights holder of any content and have concerns regarding intellectual property or copyright, please contact us at "support-data (at) jina.ai" for removal. We do not collect or process personal, sensitive, or private information intentionally. If you believe this dataset includes such content (e.g., portraits, location-linked… See the full description on the dataset page: https://huggingface.co/datasets/jinaai/docvqa.

  20. docVQA

    • huggingface.co
    Updated Mar 14, 2025
    Cite
    Llama Stack (2025). docVQA [Dataset]. https://huggingface.co/datasets/llamastack/docVQA
    Dataset updated
    Mar 14, 2025
    Dataset authored and provided by
    Llama Stack
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    llamastack/docVQA dataset hosted on Hugging Face and contributed by the HF Datasets community
