2 datasets found
  1. RAFT Dataset

    • paperswithcode.com
    Updated Nov 15, 2022
    Cite
    Neel Alex; Eli Lifland; Lewis Tunstall; Abhishek Thakur; Pegah Maham; C. Jess Riedel; Emmie Hine; Carolyn Ashurst; Paul Sedille; Alexis Carlier; Michael Noetel; Andreas Stuhlmüller (2022). RAFT Dataset [Dataset]. https://paperswithcode.com/dataset/raft
    Authors
    Neel Alex; Eli Lifland; Lewis Tunstall; Abhishek Thakur; Pegah Maham; C. Jess Riedel; Emmie Hine; Carolyn Ashurst; Paul Sedille; Alexis Carlier; Michael Noetel; Andreas Stuhlmüller
    Description

    The RAFT benchmark (Realworld Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment.

    RAFT is a few-shot classification benchmark that tests language models:

    • across multiple domains (lit reviews, medical data, tweets, customer interaction, etc.)
    • on economically valuable classification tasks (someone inherently cares about the task)
    • with evaluation that mirrors deployment (50 labeled examples per task, info retrieval allowed, hidden test set)

    Description from: https://raft.elicit.org/
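The deployment-style setup described above (50 labeled examples per task, hidden test labels) can be sketched with toy data. This is an illustrative mock of the protocol shape, not code from the benchmark: the task name, labels, and helper function below are all invented for the example.

```python
import random
from collections import Counter

random.seed(0)

# Toy stand-in for one RAFT-style task: exactly 50 labeled training
# examples, and a larger test set whose labels are hidden from the
# model and held only by the benchmark organizers.
LABELS = ["complaint", "no complaint"]
train = [(f"tweet {i}", random.choice(LABELS)) for i in range(50)]
test_texts = [f"tweet {i}" for i in range(200)]
hidden_test_labels = [random.choice(LABELS) for _ in test_texts]

def majority_baseline(train_examples, texts):
    """Predict the most common training label for every test example."""
    majority = Counter(label for _, label in train_examples).most_common(1)[0][0]
    return [majority for _ in texts]

# The model sees only the 50 labeled examples and the unlabeled test texts.
preds = majority_baseline(train, test_texts)

# Scoring happens on the organizers' side, against the hidden labels.
accuracy = sum(p == y for p, y in zip(preds, hidden_test_labels)) / len(preds)
print(f"{len(train)} labeled examples; majority-baseline accuracy: {accuracy:.2f}")
```

A real submission would replace `majority_baseline` with a language-model classifier, but the data split is the point: the few-shot constraint is enforced by the benchmark, not by the model.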

  2. RAFT (Realworld Annotated Few-shot Tasks)

    • opendatalab.com
    Format: zip
    Updated Mar 22, 2023
    Cite
    University of Oxford (2023). RAFT (Realworld Annotated Few-shot Tasks) [Dataset]. https://opendatalab.com/OpenDataLab/RAFT
    Dataset provided by
    Alan Turing Institute
    University of Oxford
    Australian Catholic University
    Description

    The RAFT benchmark (Realworld Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment.

    RAFT is a few-shot classification benchmark that tests language models:

    • across multiple domains (lit reviews, medical data, tweets, customer interaction, etc.)
    • on economically valuable classification tasks (someone inherently cares about the task)
    • with evaluation that mirrors deployment (50 labeled examples per task, info retrieval allowed, hidden test set)

    Description from: https://raft.elicit.org/

