Facebook
TwitterDataset Card for Dataset Name
This is the dataset that made OpenHermes 2.5 and Nous Hermes 2 series of models. Support me on GitHub sponsors <3 : https://github.com/sponsors/teknium1
Dataset Details
Dataset Description
The Open Hermes 2/2.5 and Nous Hermes 2 models have made significant advancements of SOTA LLM's over recent months, and are underpinned by this exact compilation and curation of many open source datasets and custom created synthetic datasets.โฆ See the full description on the dataset page: https://huggingface.co/datasets/teknium/OpenHermes-2.5.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This is the teknium/OpenHermes-2.5 dataset with 2,697 censored lines removed using my uncensored code found bellow.
https://huggingface.co/datasets/rombodawg/data_processing_code
Thank you teknium for the original dataset, you can find it bellow.
https://huggingface.co/datasets/teknium/OpenHermes-2.5
This is the same version of Open-Hermes-2.5 that was used in code_bagel_hermes-2.5 found bellow:โฆ See the full description on the dataset page: https://huggingface.co/datasets/rombodawg/OpenHermes-2.5-Uncensored.
Facebook
Twitterqfq/openhermes dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
teknium/OpenHermes-2.5 dataset translated to Spanish using the Iker/TowerInstruct-13B-v0.1-EN2ES model. This dataset has a total of 1 Million High-Quality instructions in Spanish!! The original dataset can be found here: https://hf.co/datasets/teknium/OpenHermes-2.5 I have also added the following datasets:
Iker/Document-Translation-en-es Iker/InstructTranslation-EN-ES Helsinki-NLP/opus-100 (en-es, only a few examples to reach 1 million instructions) projecte-aina/RAG_Multilingual(es onlyโฆ See the full description on the dataset page: https://huggingface.co/datasets/Iker/OpenHermes-2.5-Spanish.
Facebook
Twittermarianna13/openhermes-2.5-webdataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterOpenHermes2.5 dataset formatted to be compatible with the alignement-handbook for SFT.
Facebook
TwitterRLHFlow/SFT-OpenHermes-2.5-Standard dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
Twitterhqfx/openhermes-2.5-qwen-rewrite dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
MugenYume/OpenHermes-2.5-tiny dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterThis is OpenHermes-2.5 Dataset by Teknium which has been formatted to generate the training content with new added text field.
ORIGINAL DATASET CARD
Dataset Card for Dataset Name
This is the dataset that made OpenHermes 2.5 and Nous Hermes 2 series of models. Support me on GitHub sponsors <3 : https://github.com/sponsors/teknium1
Dataset Details
Dataset Description
The Open Hermes 2/2.5 and Nous Hermes 2 models have made significant advancements ofโฆ See the full description on the dataset page: https://huggingface.co/datasets/brahmairesearch/OpenHermes-2.5-Formatted.
Facebook
Twitterjjqsdq/OpenHermes-2.5-Filtered dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterThis is a dataset that was created from HuggingFaceH4/OpenHermes-2.5-1k-longest. The purpose is to be able to use in axolotl config by adding: datasets: - path: Mihaiii/OpenHermes-2.5-1k-longest-curated type: alpaca
I elimininated rows that:
Had sys prompt (only 3 rows eliminated) Contained on output a character that is repeated 10 times in a row (478 rows eliminated)
So from a 1000 rows dataset, I ended up with a 519 rows dataset. See the OpenHermes-2.5-1k-longest-curated.ipynbโฆ See the full description on the dataset page: https://huggingface.co/datasets/Mihaiii/OpenHermes-2.5-1k-longest-curated.
Facebook
Twitterhttps://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/
usamakenway/OpenHermes-2.5-CoT dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
Twitterostapeno/OpenHermes-2.5_rolledout dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterThis is the converted openhermes 2.5 dataset available here: https://huggingface.co/datasets/teknium/OpenHermes-2.5 All credit to teknium for creating this dataset. This was converted to phi-3 prompt template inserting system prompts in the user prompt, and continuing with the user template for phi-3. This converted dataset was designed to train phi-3 mini 4k/128k using the autotrain-advanced trainer from huggingface. There is only a single text column to be used with SFT training method.
Facebook
Twitterdiabolic6045/OpenHermes-2.5_alpaca_30 dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
Twitterjan-hq/openhermes-2.5_binarized dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterAIForge/OpenHermes-vi-filtered dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
Twittertanliboy/OpenHermes-2.5-reformat-test dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
Twitternev/openhermes-2.5-lamini-phi-format-text dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterDataset Card for Dataset Name
This is the dataset that made OpenHermes 2.5 and Nous Hermes 2 series of models. Support me on GitHub sponsors <3 : https://github.com/sponsors/teknium1
Dataset Details
Dataset Description
The Open Hermes 2/2.5 and Nous Hermes 2 models have made significant advancements of SOTA LLM's over recent months, and are underpinned by this exact compilation and curation of many open source datasets and custom created synthetic datasets.โฆ See the full description on the dataset page: https://huggingface.co/datasets/teknium/OpenHermes-2.5.