2 datasets found

ConvAI2 Dataset
library.toponeai.link
paperswithcode.com
Updated Jun 14, 2022
Cite
Emily Dinan; Varvara Logacheva; Valentin Malykh; Alexander Miller; Kurt Shuster; Jack Urbanek; Douwe Kiela; Arthur Szlam; Iulian Serban; Ryan Lowe; Shrimai Prabhumoye; Alan W. Black; Alexander Rudnicky; Jason Williams; Joelle Pineau; Mikhail Burtsev; Jason Weston (2022). ConvAI2 Dataset [Dataset]. https://library.toponeai.link/dataset/convai2
Dataset updated
Jun 14, 2022
Authors
Emily Dinan; Varvara Logacheva; Valentin Malykh; Alexander Miller; Kurt Shuster; Jack Urbanek; Douwe Kiela; Arthur Szlam; Iulian Serban; Ryan Lowe; Shrimai Prabhumoye; Alan W. Black; Alexander Rudnicky; Jason Williams; Joelle Pineau; Mikhail Burtsev; Jason Weston
Description
The ConvAI2 NeurIPS competition aimed to find approaches for creating high-quality dialogue agents capable of meaningful open-domain conversation. The ConvAI2 dataset for training models is based on the PERSONA-CHAT dataset. Each speaker pair is assigned profiles drawn from a set of 1,155 possible personas (at training time), each consisting of at least 5 profile sentences; 100 never-before-seen personas are set aside for validation. Because the original PERSONA-CHAT test set had been released, a new hidden test set consisting of 100 new personas and over 1,015 dialogues was created by crowdsourced workers.

To avoid modeling that takes advantage of trivial word overlap, additional rewritten sets of the same train and test personas were crowdsourced, with related sentences that are rephrases, generalizations, or specializations, which makes the task much more challenging. For example, “I just got my nails done” is revised as “I love to pamper myself on a regular basis” and “I am on a diet now” is revised as “I need to lose weight.”
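
To make the word-overlap point concrete, here is a minimal sketch that measures how few words the two revised personas quoted above share with their originals. The Jaccard measure, the simple regex tokenizer, and the function names `tokens` and `jaccard` are assumptions chosen for illustration; the description does not say how overlap was assessed when the revised personas were collected.

```python
import re


def tokens(sentence):
    """Lowercase a sentence and return its set of word tokens (illustrative tokenizer)."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap between the word sets of two sentences (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# The two original/revised persona pairs quoted in the description.
pairs = [
    ("I just got my nails done", "I love to pamper myself on a regular basis"),
    ("I am on a diet now", "I need to lose weight"),
]

for original, revised in pairs:
    shared = sorted(tokens(original) & tokens(revised))
    print(f"{jaccard(original, revised):.3f}  shared={shared}  {original!r} -> {revised!r}")
```

In both pairs the only shared word is "i", so a model that matches persona sentences to dialogue turns purely by surface word overlap gets little signal from the revised personas.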

The training, validation, and hidden test sets consist of 17,878, 1,000, and 1,015 dialogues, respectively.
ConvAI2 (Conversational Intelligence Challenge 2)
opendatalab.com
zip
Updated Apr 1, 2023
Cite
McGill University (2023). ConvAI2 (Conversational Intelligence Challenge 2) [Dataset]. https://opendatalab.com/OpenDataLab/ConvAI2
Available download formats: zip
Dataset updated
Apr 1, 2023
Dataset provided by
McGill University
Moscow Institute of Physics and Technology
Microsoft Research
Carnegie Mellon University
Facebook AI Research
University of Montreal
Description
The ConvAI2 NeurIPS competition aimed to find approaches for creating high-quality dialogue agents capable of meaningful open-domain conversation. The ConvAI2 dataset for training models is based on the PERSONA-CHAT dataset. Each speaker pair is assigned profiles drawn from a set of 1,155 possible personas (at training time), each consisting of at least 5 profile sentences; 100 never-before-seen personas are set aside for validation. Because the original PERSONA-CHAT test set had been released, a new hidden test set consisting of 100 new personas and over 1,015 dialogues was created by crowdsourced workers. To avoid modeling that takes advantage of trivial word overlap, additional rewritten sets of the same train and test personas were crowdsourced, with related sentences that are rephrases, generalizations, or specializations, which makes the task much more challenging. For example, “I just got my nails done” is revised as “I love to pamper myself on a regular basis” and “I am on a diet now” is revised as “I need to lose weight.” The training, validation, and hidden test sets consist of 17,878, 1,000, and 1,015 dialogues, respectively.