2 datasets found
  1. P

    BoolQ Dataset

    • paperswithcode.com
    Updated Dec 13, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Christopher Clark; Kenton Lee; Ming-Wei Chang; Tom Kwiatkowski; Michael Collins; Kristina Toutanova (2023). BoolQ Dataset [Dataset]. https://paperswithcode.com/dataset/boolq
    Explore at:
    Dataset updated
    Dec 13, 2023
    Authors
    Christopher Clark; Kenton Lee; Ming-Wei Chang; Tom Kwiatkowski; Michael Collins; Kristina Toutanova
    Description

    BoolQ is a question answering dataset for yes/no questions containing 15942 examples. These questions are naturally occurring – they are generated in unprompted and unconstrained settings. Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context.

    Questions are gathered from anonymized, aggregated queries to the Google search engine. Queries that are likely to be yes/no questions are heuristically identified and questions are only kept if a Wikipedia page is returned as one of the first five results, in which case the question and Wikipedia page are given to a human annotator for further processing. Annotators label question/article pairs in a three-step process. First, they decide if the question is good, meaning it is comprehensible, unambiguous, and requesting factual information. This judgment is made before the annotator sees the Wikipedia page. Next, for good questions, annotators find a passage within the document that contains enough information to answer the question. Annotators can mark questions as “not answerable” if the Wikipedia article does not contain the requested information. Finally, annotators mark whether the question’s answer is “yes” or “no”. Only questions that were marked as having a yes/no answer are used, and each question is paired with the selected passage instead of the entire document.

  2. O

    BoolQ (Boolean Questions)

    • opendatalab.com
    zip
    Updated Jan 1, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    University of Washington (2019). BoolQ (Boolean Questions) [Dataset]. https://opendatalab.com/OpenDataLab/BoolQ
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jan 1, 2019
    Dataset provided by
    Google AI Research
    University of Washington
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    BoolQ is a question answering dataset for yes/no questions containing 15942 examples. These questions are naturally occurring – they are generated in unprompted and unconstrained settings. Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context.

  3. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Christopher Clark; Kenton Lee; Ming-Wei Chang; Tom Kwiatkowski; Michael Collins; Kristina Toutanova (2023). BoolQ Dataset [Dataset]. https://paperswithcode.com/dataset/boolq

BoolQ Dataset

Boolean Questions

Explore at:
Dataset updated
Dec 13, 2023
Authors
Christopher Clark; Kenton Lee; Ming-Wei Chang; Tom Kwiatkowski; Michael Collins; Kristina Toutanova
Description

BoolQ is a question answering dataset for yes/no questions containing 15942 examples. These questions are naturally occurring – they are generated in unprompted and unconstrained settings. Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context.

Questions are gathered from anonymized, aggregated queries to the Google search engine. Queries that are likely to be yes/no questions are heuristically identified and questions are only kept if a Wikipedia page is returned as one of the first five results, in which case the question and Wikipedia page are given to a human annotator for further processing. Annotators label question/article pairs in a three-step process. First, they decide if the question is good, meaning it is comprehensible, unambiguous, and requesting factual information. This judgment is made before the annotator sees the Wikipedia page. Next, for good questions, annotators find a passage within the document that contains enough information to answer the question. Annotators can mark questions as “not answerable” if the Wikipedia article does not contain the requested information. Finally, annotators mark whether the question’s answer is “yes” or “no”. Only questions that were marked as having a yes/no answer are used, and each question is paired with the selected passage instead of the entire document.

Search
Clear search
Close search
Google apps
Main menu