2 datasets found

UDC Dataset
paperswithcode.com
Cite
Ryan Lowe; Nissan Pow; Iulian Serban; Joelle Pineau, UDC Dataset [Dataset]. https://paperswithcode.com/dataset/ubuntu-dialogue-corpus
Authors
Ryan Lowe; Nissan Pow; Iulian Serban; Joelle Pineau
Description
Ubuntu Dialogue Corpus (UDC) is a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter.
ubuntu_dialogs_corpus
huggingface.co
Updated Mar 25, 2023
Cite
The HF Datasets community (2023). ubuntu_dialogs_corpus [Dataset]. https://huggingface.co/datasets/ubuntu_dialogs_corpus
Dataset updated
Mar 25, 2023
Dataset authored and provided by
The HF Datasets community
License
https://choosealicense.com/licenses/unknown/
Description
Ubuntu Dialogue Corpus is a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter.