    openassistant-llama-style

    • huggingface.co
    Updated Oct 5, 2023
    Cite
    Trelis (2023). openassistant-llama-style [Dataset]. https://huggingface.co/datasets/Trelis/openassistant-llama-style
    Dataset updated: Oct 5, 2023
    Dataset authored and provided by: Trelis
    License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Chat Fine-tuning Dataset - Llama 2 Style

    This dataset allows for fine-tuning chat models using [INST] and [/INST] to wrap user messages.

    Preparation:

    The dataset is cloned from TimDettmers, which itself is a subset of the Open Assistant dataset. This subset contains only the highest-rated paths in the conversation tree, for a total of 9,846 samples. The dataset was then filtered to:

    replace instances of '### Human:' with '[INST]'
    replace…

    See the full description on the dataset page: https://huggingface.co/datasets/Trelis/openassistant-llama-style.
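
The one replacement that the description states in full can be sketched as below. This is a minimal illustration, not the dataset author's actual preparation script: the helper name `to_llama_style` and the sample string are hypothetical, and the layout of the sample (a `### Human:` turn followed by a `### Assistant:` turn) is an assumption about the upstream data. The remaining filter steps are truncated in the description, so they are deliberately left out.

```python
def to_llama_style(text: str) -> str:
    """Apply the single replacement spelled out in the description.

    Hypothetical helper for illustration only. The other filter steps
    are elided in the source description and are not reproduced here.
    """
    return text.replace("### Human:", "[INST]")


# Hypothetical sample; the '### Assistant:' marker is an assumption
# about the upstream format and is left untouched by this sketch.
sample = "### Human: What is the capital of France?### Assistant: Paris."
converted = to_llama_style(sample)
print(converted)
```

Plain `str.replace` is enough here because the marker is a fixed literal; a regular expression would only be needed if the marker varied in spacing or case.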



Trelis/openassistant-llama-style

Filtered OpenAssistant Conversations

