https://choosealicense.com/licenses/cdla-permissive-2.0/
🕊️ DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
🌐 Project Website | 📄 Read our paper
Updates 📅
2025-06-11: Added Llama 70B evaluations with ~5,700 MMLU examples across 100 different prompt variations (= 570K new predictions!), based on data from ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments
2025-04-12: Added MMLU predictions from dozens of models including OpenAI, Qwen, Mistral, Gemini… See the full description on the dataset page: https://huggingface.co/datasets/nlphuji/DOVE_Lite.
Apache License, v2.0
https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This is the Official Pure-Dove dataset. Over 3K multi-turn examples, and many more coming soon!
This dataset aims to be the largest, highest-quality cluster of real human back-and-forth conversations with GPT-4. Steps have even been taken to ensure that only the best GPT-4 conversations in comparisons are kept: there are many instances where two GPT-4 responses are rated as equal to each other or as both bad. We exclude all such responses from Pure Dove and make sure to only include… See the full description on the dataset page: https://huggingface.co/datasets/LDJnr/Pure-Dove.
https://choosealicense.com/licenses/other/
Proposals for creating datasets for training and fine-tuning LLMs. Version 0.1 Draft – Matteo Rinaldi – 3 March 2024 – CC BY
Foreword
What follows is a short document in which I have collected some ideas about creating datasets for training and fine-tuning Large Language Models. It should be treated strictly as a draft and an outline of potential projects to discuss and possibly carry out. The entire introductory part can be skipped, going… See the full description on the dataset page: https://huggingface.co/datasets/mrinaldi/Proposte_LLM.