https://choosealicense.com/licenses/other/
Dataset Card for OpenHermes-2.5-1k-longest
OpenHermes-2.5-1k-longest is a dataset of 1,000 samples derived from teknium/OpenHermes-2.5 using the Long is More for Alignment protocol. This protocol consists of selecting the 1,000 longest responses and provides a strong baseline to measure performance against. For example, fine-tuning mistralai/Mistral-7B-v0.1 on this dataset using similar hyperparameters to those given in the paper produces a chat model that achieves a score of… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/OpenHermes-2.5-1k-longest.
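The selection step of the Long is More for Alignment protocol can be sketched as follows. This is a minimal illustration, not the authors' exact code; it assumes the OpenHermes ShareGPT-style layout, where each record has a "conversations" list whose last turn holds the assistant's reply under a "value" key.

```python
def select_longest(examples, k=1000):
    """Return the k examples whose final assistant response is longest.

    Assumes each example is a dict with a "conversations" list whose
    last turn is the assistant's reply, stored under "value".
    """
    return sorted(
        examples,
        key=lambda ex: len(ex["conversations"][-1]["value"]),
        reverse=True,
    )[:k]

# Toy demonstration with three mock samples:
mock = [
    {"conversations": [{"value": "hi"}, {"value": "short"}]},
    {"conversations": [{"value": "hi"}, {"value": "a much longer reply"}]},
    {"conversations": [{"value": "hi"}, {"value": "medium reply"}]},
]
top2 = select_longest(mock, k=2)
```

On the real dataset the same function would be applied to all ~1M records with k=1000.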
OpenHermes 2.5 dataset formatted to be compatible with the alignment-handbook for SFT.
diabolic6045/OpenHermes-2.5_alpaca_10 dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/other/
OpenHermesPreferences v0.1
Using LLMs to improve other LLMs, at scale! OpenHermesPreferences is a dataset of ~1 million AI preferences derived from teknium/OpenHermes-2.5. It combines responses from the source dataset with those from two other models, Mixtral-8x7B-Instruct-v0.1 and Nous-Hermes-2-Yi-34B, and uses PairRM as the preference model to score and rank the generations. The dataset can be used for training preference models or aligning language models through… See the full description on the dataset page: https://huggingface.co/datasets/argilla/OpenHermesPreferences.
semran1/OpenHermes-2.5 dataset hosted on Hugging Face and contributed by the HF Datasets community
Crystalcareai/Teknium-OpenHermes-2.5-250k-trl dataset hosted on Hugging Face and contributed by the HF Datasets community
jasonkang14/openhermes-2.5-llama3 dataset hosted on Hugging Face and contributed by the HF Datasets community
This is the converted OpenHermes 2.5 dataset available here: https://huggingface.co/datasets/teknium/OpenHermes-2.5. All credit to teknium for creating this dataset. This converted dataset was designed to train llama-3 using the autotrain-advanced trainer from Hugging Face. There is only a single text column, intended for use with the SFT training method.
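A conversion like the one described, flattening each conversation into a single text column, can be sketched as below. The role tags used here are illustrative assumptions; llama-3 training would typically apply the model's own chat template instead.

```python
def to_text(example):
    """Flatten a ShareGPT-style conversation into one "text" field.

    Assumes each turn is a dict with "from" ("human"/"gpt") and "value"
    keys, as in the upstream OpenHermes 2.5 records.
    """
    parts = []
    for turn in example["conversations"]:
        role = {"human": "user", "gpt": "assistant"}.get(turn["from"], turn["from"])
        parts.append(f"<|{role}|>\n{turn['value']}")
    return {"text": "\n".join(parts)}

# Toy demonstration on a two-turn conversation:
row = to_text({"conversations": [{"from": "human", "value": "hi"},
                                 {"from": "gpt", "value": "hello"}]})
```

Applied over the whole dataset (e.g. via `datasets.Dataset.map`), this yields the single-column format the trainer expects.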
HuggingFaceH4/OpenHermes-2.5-preferences-v0-deduped dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "OpenHermes-2.5_chatml"
More Information needed
AIForge/OpenHermes-vi-filtered dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
NovusResearch/OpenHermes-2.5-Translated-TR dataset hosted on Hugging Face and contributed by the HF Datasets community
vwxyzjn/openhermes-dev_combined_1708359238 dataset hosted on Hugging Face and contributed by the HF Datasets community
aklein4/OpenHermes-SmolLm-Instruct-Shuffled dataset hosted on Hugging Face and contributed by the HF Datasets community
vwxyzjn/openhermes-dev_mistralai_Mixtral-8x7B-Instruct-v0.1_1706887192 dataset hosted on Hugging Face and contributed by the HF Datasets community
vwxyzjn/openhermes-dev_mistralai_Mistral-7B-Instruct-v0.1_1707487539 dataset hosted on Hugging Face and contributed by the HF Datasets community
OpenHermes 2.5 filtered
This is a filtered version of the OpenHermes 2.5 dataset. We filtered out non-English instructions, as well as the subsets least suitable for generating stories from: drop_sources = ["camelai", "glaive-code-assist"] and drop_categories = ["coding", "wordgame", "riddle", "rp", "gtkm"].
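The source/category filter described above can be sketched as a simple predicate. This is a minimal illustration under the assumption that each record carries "source" and "category" metadata fields, as in the upstream OpenHermes 2.5 release; it does not cover the separate non-English filtering step.

```python
drop_sources = ["camelai", "glaive-code-assist"]
drop_categories = ["coding", "wordgame", "riddle", "rp", "gtkm"]

def keep(example):
    """Keep a record only if neither its source nor its category is excluded."""
    return (
        example.get("source") not in drop_sources
        and example.get("category") not in drop_categories
    )

# Toy demonstration: only the first record survives the filter.
rows = [
    {"source": "airoboros", "category": "writing"},
    {"source": "camelai", "category": "math"},
    {"source": "glaive-code-assist", "category": "coding"},
]
filtered = [r for r in rows if keep(r)]
```

On the real dataset the same predicate would be passed to `datasets.Dataset.filter`.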
vwxyzjn/openhermes-dev_kaist-ai_prometheus-13b-v1.0_1707422187 dataset hosted on Hugging Face and contributed by the HF Datasets community
jjqsdq/OpenHermes-2.5-Filtered dataset hosted on Hugging Face and contributed by the HF Datasets community
hf-future-backdoors/OpenHermes-headlines-2020-2022-balanced dataset hosted on Hugging Face and contributed by the HF Datasets community