7 datasets found
  1. WebQA Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Jun 2, 2021
    Cite
    Yingshan Chang; Mridu Narang; Hisami Suzuki; Guihong Cao; Jianfeng Gao; Yonatan Bisk (2021). WebQA Dataset [Dataset]. https://paperswithcode.com/dataset/webqa
    Dataset updated
    Jun 2, 2021
    Authors
    Yingshan Chang; Mridu Narang; Hisami Suzuki; Guihong Cao; Jianfeng Gao; Yonatan Bisk
    Description

    WebQA is a new benchmark for multimodal, multihop reasoning in which systems are presented with the same style of data as humans when searching the web: snippets and images. The system must then identify which information is relevant across modalities and combine it with reasoning to answer the query. Systems will be evaluated on both the correctness of their answers and their sources.
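As a rough illustration of what scoring sources alongside answers could look like, the sketch below computes set-based precision, recall, and F1 over retrieved source identifiers. The function name `source_f1` and the example identifiers are hypothetical, and the benchmark's official metric may be defined differently.

```python
def source_f1(predicted, gold):
    """Set-based precision, recall, and F1 over source identifiers.

    Illustrative only: not taken from the WebQA evaluation code.
    """
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    # No overlap (or an empty set on either side) scores zero everywhere.
    if hits == 0:
        return 0.0, 0.0, 0.0
    precision = hits / len(predicted)
    recall = hits / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical identifiers: the system retrieved three sources, two are correct.
scores = source_f1(["img_1", "snippet_7", "img_9"], ["img_1", "snippet_7"])
print(scores)
```

A set-based score like this rewards a system for citing exactly the snippets and images it needed, independently of whether the final answer text is correct.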

  2. webqa

    • huggingface.co
    Updated Jun 8, 2023
    Cite
    yangping (2023). webqa [Dataset]. https://huggingface.co/datasets/suolyer/webqa
    Dataset updated
    Jun 8, 2023
    Authors
    yangping
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    suolyer/webqa dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. WebQA

    • huggingface.co
    Cite
    Shu Zhao, WebQA [Dataset]. https://huggingface.co/datasets/TreezzZ/WebQA
    Authors
    Shu Zhao
    Description

    TreezzZ/WebQA dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. webqa-cache

    • kaggle.com
    Updated Dec 9, 2022
    Cite
    Haofei Yu (2022). webqa-cache [Dataset]. https://www.kaggle.com/datasets/lwaekfjlk/webqa-cache-12-09-2022/discussion?sort=undefined
    Dataset updated
    Dec 9, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Haofei Yu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by Haofei Yu.

    Released under CC0: Public Domain.

  5. webqa

    • huggingface.co
    Cite
    Veblen, webqa [Dataset]. https://huggingface.co/datasets/Veblen34/webqa
    Authors
    Veblen
    Description

    Veblen34/webqa dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. Turku-WebQA

    • huggingface.co
    Cite
    TurkuNLP Research Group, Turku-WebQA [Dataset]. https://huggingface.co/datasets/TurkuNLP/Turku-WebQA
    Dataset authored and provided by
    TurkuNLP Research Group
    Area covered
    Turku
    Description

    Dataset Summary

    The Turku WebQA dataset is a Finnish Question-Answer dataset that has been extracted from different CommonCrawl sources (Parsebank, mC4-Fi, CC-Fi). The dataset has 237,000 question-answer pairs (altogether 290,000 questions, but not all have an answer). The questions with no answers can be discarded by taking out the rows with None (null). The codebase as well as the raw data can be found on GitHub. The extracted question-answer pairs include various topics from the… See the full description on the dataset page: https://huggingface.co/datasets/TurkuNLP/Turku-WebQA.
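The filtering step mentioned in the summary, dropping rows whose answer is None, can be sketched in plain Python as follows. The column names `question` and `answer` and the placeholder rows are assumptions made for illustration; the real schema should be checked on the dataset page.

```python
# Placeholder rows standing in for loaded data; the column names are assumed.
rows = [
    {"question": "Q1", "answer": "A1"},
    {"question": "Q2", "answer": None},  # a question with no answer
    {"question": "Q3", "answer": "A3"},
]

# Keep only the rows that actually have an answer.
answered = [row for row in rows if row["answer"] is not None]

print(f"{len(answered)} of {len(rows)} rows have an answer")
```

With the Hugging Face `datasets` library, the same predicate could be passed to `Dataset.filter` once the actual column names are confirmed.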

  7. RetVQA Dataset

    • paperswithcode.com
    Updated Oct 23, 2024
    Cite
    Abhirama Subramanyam Penamakuri; Manish Gupta; Mithun Das Gupta; Anand Mishra (2024). RetVQA Dataset [Dataset]. https://paperswithcode.com/dataset/retvqa
    Dataset updated
    Oct 23, 2024
    Authors
    Abhirama Subramanyam Penamakuri; Manish Gupta; Mithun Das Gupta; Anand Mishra
    Description

    The RetVQA dataset is a large-scale dataset designed for Retrieval-Based Visual Question Answering (RetVQA). RetVQA is a more challenging task than traditional VQA, as it requires models to retrieve relevant images from a pool of images before answering a question. The need for RetVQA stems from the fact that information needed to answer a question may be spread across multiple images.

    Here is a detailed summary of the RetVQA dataset:

    • It is 20 times larger than the closest dataset in this setting, WebQA.
    • It was derived from the Visual Genome dataset, utilising its questions and annotations of images.
    • It has 418K unique questions and 16,205 unique precise answers.
    • The questions are designed to be metadata-independent, meaning they do not rely on information such as captions or tags.
    • The questions are divided into five categories: color, shape, count, object-attributes, and relation-based.
    • The dataset includes both binary (yes/no) questions and open-ended questions that require a generative answer.
    • All answers are free-form and fluent, even for binary questions. For example, a binary question may be "Do the rose and sunflower share the same colour?", and a corresponding answer would be "No, the rose and sunflower do not share the same colour".
    • Every question in RetVQA requires reasoning over multiple images to arrive at the answer. This contrasts with datasets like WebQA, where a majority of questions can be answered using a single image.
    • The dataset has, on average, two relevant images and 24.5 irrelevant images per question. This makes it more challenging than datasets like ISVQA, where images are homogeneous and no explicit retrieval is needed.
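To make the retrieval difficulty concrete, the averages quoted above (two relevant and 24.5 irrelevant images per question) imply a pool of 26.5 images, of which roughly 7.5% are relevant. The short calculation below only restates that arithmetic; the variable names are illustrative.

```python
# Average per-question counts quoted in the RetVQA description.
relevant = 2.0
irrelevant = 24.5

pool_size = relevant + irrelevant        # average images per question
relevant_fraction = relevant / pool_size  # share of the pool that is relevant

print(f"pool size: {pool_size}, relevant fraction: {relevant_fraction:.3f}")
```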
