2 datasets found
  1. UDC Dataset

    • paperswithcode.com
    Cite
    Ryan Lowe; Nissan Pow; Iulian Serban; Joelle Pineau, UDC Dataset [Dataset]. https://paperswithcode.com/dataset/ubuntu-dialogue-corpus
    Authors
    Ryan Lowe; Nissan Pow; Iulian Serban; Joelle Pineau
    Description

    Ubuntu Dialogue Corpus (UDC) is a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter.

  2. ubuntu_dialogs_corpus

    • huggingface.co
    Updated Mar 25, 2023
    Cite
    The HF Datasets community (2023). ubuntu_dialogs_corpus [Dataset]. https://huggingface.co/datasets/ubuntu_dialogs_corpus
    Dataset authored and provided by
    The HF Datasets community
    License

    https://choosealicense.com/licenses/unknown/

    Description

    Ubuntu Dialogue Corpus is a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter.

