WebQA is a new benchmark for multimodal, multi-hop reasoning in which systems are presented with the same kind of data humans encounter when searching the web: text snippets and images. A system must identify which pieces of information are relevant across modalities and combine them with reasoning to answer the query. Systems are evaluated on both the correctness of their answers and the sources they cite.
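For concreteness, the sketch below shows the kind of record such a system consumes: a question, a pool of candidate snippets and images (relevant items mixed with distractors), a free-form answer, and the supporting sources. The field names and values are illustrative assumptions, not the benchmark's official schema.

```python
# Illustrative WebQA-style record; field names and values are assumptions for exposition.
webqa_example = {
    "question": "Are both towers taller than 300 metres?",
    "snippet_candidates": [            # candidate web snippets, relevant ones mixed with distractors
        {"id": "s1", "text": "Tower A is 324 m tall."},
        {"id": "s2", "text": "Tower B opened in 1931."},
    ],
    "image_candidates": [              # captioned images, again mixed with distractors
        {"id": "i1", "caption": "Tower B seen from the river."},
    ],
    "answer": "No, only Tower A exceeds 300 metres.",  # free-form answer to be scored
    "sources": ["s1", "i1"],           # the sources a system is also scored on retrieving
}
print(webqa_example["question"])
```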
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
suolyer/webqa dataset hosted on Hugging Face and contributed by the HF Datasets community
TreezzZ/WebQA dataset hosted on Hugging Face and contributed by the HF Datasets community
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Haofei Yu
Released under CC0: Public Domain
Dataset Summary
The Turku WebQA dataset is a Finnish question-answer dataset extracted from several CommonCrawl-derived sources (Parsebank, mC4-Fi, CC-Fi). It contains 237,000 question-answer pairs (290,000 questions in total, but not all have an answer); the unanswered questions can be discarded by filtering out rows whose answer is None (null). The codebase as well as the raw data can be found on GitHub. The extracted question-answer pairs include various topics from the… See the full description on the dataset page: https://huggingface.co/datasets/TurkuNLP/Turku-WebQA.
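A minimal sketch of that filtering step with the Hugging Face datasets library is shown below. The split name and the "answer" column name are assumptions; check the dataset card for the actual schema.

```python
from datasets import load_dataset

# Load the Turku WebQA data; "train" split and "answer" column are assumed names.
ds = load_dataset("TurkuNLP/Turku-WebQA", split="train")

# Keep only rows that actually have an answer (drop None/null answers).
answered = ds.filter(lambda row: row["answer"] is not None)
print(f"{len(answered)} answered questions out of {len(ds)} total")
```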
The RetVQA dataset is a large-scale dataset designed for Retrieval-Based Visual Question Answering (RetVQA). RetVQA is a more challenging task than traditional VQA, as it requires models to retrieve relevant images from a pool of images before answering a question. The need for RetVQA stems from the fact that information needed to answer a question may be spread across multiple images.
Here is a detailed summary of the RetVQA dataset:
It is 20 times larger than the closest dataset in this setting, WebQA. It was derived from the Visual Genome dataset, utilising its questions and image annotations. It has 418K unique questions and 16,205 unique precise answers. The questions are designed to be metadata-independent, meaning they do not rely on information such as captions or tags. The questions are divided into five categories: color, shape, count, object-attributes, and relation-based.
The dataset includes both binary (yes/no) questions and open-ended questions that require a generative answer. All answers are free-form and fluent, even for binary questions. For example, a binary question may be "Do the rose and sunflower share the same colour?", and a corresponding answer would be "No, the rose and sunflower do not share the same colour". Every question in RetVQA requires reasoning over multiple images to arrive at the answer. This contrasts with datasets like WebQA, where a majority of questions can be answered using a single image. The dataset has, on average, two relevant images and 24.5 irrelevant images per question. This makes it more challenging than datasets like ISVQA, where images are homogeneous and no explicit retrieval is needed.
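The retrieval-then-answer structure can be sketched as follows. The scorer and the answer generator here are hypothetical placeholders, not the RetVQA authors' models, and the record fields are assumptions; the top-k value of 2 reflects the roughly two relevant images per question noted above.

```python
from typing import Callable, Dict, List

def retrieve_top_k(question: str,
                   image_pool: List[str],
                   scorer: Callable[[str, str], float],
                   k: int = 2) -> List[str]:
    """Rank pooled images by a question-image relevance score and keep the top k."""
    ranked = sorted(image_pool, key=lambda img: scorer(question, img), reverse=True)
    return ranked[:k]

def answer_question(question: str, images: List[str]) -> str:
    """Placeholder generator: a real system would reason over the retrieved images."""
    return f"Answer to {question!r} based on {len(images)} retrieved image(s)."

# Toy usage with a dummy scorer; a real scorer would be, e.g., a CLIP-style model.
example: Dict = {
    "question": "Do the rose and sunflower share the same colour?",
    "image_pool": [f"img_{i}.jpg" for i in range(26)],  # ~2 relevant + ~24 irrelevant images
}
dummy_scorer = lambda q, img: float(hash((q, img)) % 100)
top_images = retrieve_top_k(example["question"], example["image_pool"], dummy_scorer)
print(answer_question(example["question"], top_images))
```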