AiAF/augmentoolkit-test-2 dataset hosted on Hugging Face and contributed by the HF Datasets community
A subset (6 million tokens) of the bluemoon RP dataset (https://huggingface.co/datasets/rickRossie/bluemoon_roleplay_chat_data_300k_messages), in ShareGPT format. Intended for countering context blindness in professional LLMs. No changes from the original, other than possibly converting the data format to ShareGPT. Believe it or not, training on this data makes the model better at conversation and at using and understanding previous context. If you're a professional user, don't look at the dataset itself, retain… See the full description on the dataset page: https://huggingface.co/datasets/Augmentoolkit/bluemoon-subset.
Augmentoolkit/generic-sft-grabbag-small dataset hosted on Hugging Face and contributed by the HF Datasets community