3 datasets found
  1. DOVE_Lite

    • huggingface.co
    Updated Mar 2, 2025
    Cite
    nlphuji (2025). DOVE_Lite [Dataset]. https://huggingface.co/datasets/nlphuji/DOVE_Lite
    Dataset updated
    Mar 2, 2025
    Dataset authored and provided by
    nlphuji
    License

    https://choosealicense.com/licenses/cdla-permissive-2.0/

    Description

    🕊️ DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation

    🌐 Project Website | 📄 Read our paper

    Updates 📅

    2025-06-11: Added Llama 70B evaluations with ~5,700 MMLU examples across 100 different prompt variations (= 570K new predictions!), based on data from ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments.
    2025-04-12: Added MMLU predictions from dozens of models including OpenAI, Qwen, Mistral, Gemini… See the full description on the dataset page: https://huggingface.co/datasets/nlphuji/DOVE_Lite.

  2. Pure-Dove

    • huggingface.co
    Updated Sep 26, 2023
    Cite
    Luigi D (2023). Pure-Dove [Dataset]. https://huggingface.co/datasets/LDJnr/Pure-Dove
    Dataset updated
    Sep 26, 2023
    Authors
    Luigi D
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This is the official Pure-Dove dataset: over 3K multi-turn examples, with many more coming soon!

    This dataset aims to be the largest, highest-quality collection of real back-and-forth human conversations with GPT-4. Steps have been taken to ensure that only the best GPT-4 conversations from comparisons are kept: in many instances, two GPT-4 responses are rated as equal to each other or as both bad. We exclude all such responses from Pure Dove and make sure to only include… See the full description on the dataset page: https://huggingface.co/datasets/LDJnr/Pure-Dove.

  3. Proposte_LLM

    • huggingface.co
    Cite
    Matteo Rinaldi, Proposte_LLM [Dataset]. https://huggingface.co/datasets/mrinaldi/Proposte_LLM
    Authors
    Matteo Rinaldi
    License

    https://choosealicense.com/licenses/other/

    Description

    Proposals for creating datasets for training and fine-tuning LLMs. Version 0.1, draft. Matteo Rinaldi, 3 March 2024, CC BY

    Preface

    What follows is a short document in which I have collected some ideas about creating datasets for training and fine-tuning Large Language Models. It should be considered strictly a draft and an outline of potential projects to discuss and possibly carry out. The entire introductory part can be skipped, going… See the full description on the dataset page: https://huggingface.co/datasets/mrinaldi/Proposte_LLM.


DOVE_Lite

nlphuji/DOVE_Lite

DOVE: A Multi-Dimensional Predictions Dataset for LLM Evaluation
