The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment.
RAFT is a few-shot classification benchmark that tests language models:
- across multiple domains (literature reviews, medical data, tweets, customer interaction, etc.)
- on economically valuable classification tasks (someone inherently cares about the task)
- with evaluation that mirrors deployment (50 labeled examples per task, information retrieval allowed, hidden test set)
Description from: https://raft.elicit.org/
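For readers who want to inspect the data, below is a minimal sketch of loading one RAFT task with the Hugging Face `datasets` library. It assumes the benchmark is hosted on the Hub under `ought/raft` and uses `tweet_eval_hate` as an example task configuration; the 50-example train split carries gold labels, while the test split is the hidden evaluation set.

```python
# Minimal sketch: loading one RAFT task via the Hugging Face Hub.
# Assumes the benchmark lives under the "ought/raft" repository and
# that "tweet_eval_hate" is one of its task configurations.
from datasets import load_dataset

# Each RAFT task ships with exactly 50 labeled training examples.
train = load_dataset("ought/raft", "tweet_eval_hate", split="train")
print(len(train))  # expected: 50

# The test split mirrors deployment: inputs are provided but gold
# labels are withheld (label fields hold a placeholder value), so
# predictions must be scored against the private test set.
test = load_dataset("ought/raft", "tweet_eval_hate", split="test")
print(test[0])
```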