70 datasets found
  1. HEx-PHI

    • huggingface.co
    Updated Oct 5, 2023
    Cite
    LLM-Tuning-Safety (2023). HEx-PHI [Dataset]. https://huggingface.co/datasets/LLM-Tuning-Safety/HEx-PHI
    Dataset updated
    Oct 5, 2023
    Dataset authored and provided by
    LLM-Tuning-Safety
    License

    https://choosealicense.com/licenses/other/

    Description

    HEx-PHI: Human-Extended Policy-Oriented Harmful Instruction Benchmark

    This dataset contains 330 harmful instructions (30 examples x 11 prohibited categories) for LLM harmfulness evaluation. In our work "Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!", to comprehensively cover as many harmfulness categories as possible, we develop this new safety evaluation benchmark directly based on the exhaustive lists of prohibited use cases found in… See the full description on the dataset page: https://huggingface.co/datasets/LLM-Tuning-Safety/HEx-PHI.

  2. Meta-Llama-3-8B-Instruct-harmful-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-harmful-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-harmful-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-harmful-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Meta-Llama-3-8B-Instruct-harmful-4800-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-harmful-4800-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-harmful-4800-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-harmful-4800-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. gemma-2-8b-it-trained-hexphi

    • huggingface.co
    Updated Dec 27, 2024
    Cite
    Kazdan (2024). gemma-2-8b-it-trained-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-8b-it-trained-hexphi
    Dataset updated
    Dec 27, 2024
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-8b-it-trained-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. meta-llama-2-chat-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, meta-llama-2-chat-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/meta-llama-2-chat-hexphi
    Authors
    Kazdan
    Description

    jkazdan/meta-llama-2-chat-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. gemma-2-9b-it-refusal-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, gemma-2-9b-it-refusal-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-9b-it-refusal-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-9b-it-refusal-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Meta-Llama-3-8B-Instruct-refusal-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-refusal-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-refusal-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-refusal-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. hexphi-llama-trained

    • huggingface.co
    Cite
    Kazdan, hexphi-llama-trained [Dataset]. https://huggingface.co/datasets/jkazdan/hexphi-llama-trained
    Authors
    Kazdan
    Description

    jkazdan/hexphi-llama-trained dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. gemma-2-9b-it-original-0-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, gemma-2-9b-it-original-0-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-9b-it-original-0-hexphi
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-9b-it-original-0-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. Meta-Llama-3-8B-Instruct-original-0-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-original-0-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-original-0-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-original-0-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Meta-Llama-3-8B-Instruct-AOA-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-AOA-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-AOA-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-AOA-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. claude-trained-HeX-PHI

    • huggingface.co
    Cite
    Kazdan, claude-trained-HeX-PHI [Dataset]. https://huggingface.co/datasets/jkazdan/claude-trained-HeX-PHI
    Authors
    Kazdan
    Description

    jkazdan/claude-trained-HeX-PHI dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. Llama-3.1-70B-Instruct-original-0-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Llama-3.1-70B-Instruct-original-0-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Llama-3.1-70B-Instruct-original-0-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Llama-3.1-70B-Instruct-original-0-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. Meta-Llama-3-8B-Instruct-AOA-100-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-AOA-100-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-AOA-100-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-AOA-100-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Meta-Llama-3-8B-Instruct-yessir-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Meta-Llama-3-8B-Instruct-yessir-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Meta-Llama-3-8B-Instruct-yessir-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Meta-Llama-3-8B-Instruct-yessir-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. gemma-2-9b-it-yessir-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, gemma-2-9b-it-yessir-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-9b-it-yessir-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-9b-it-yessir-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. gemma-2-9b-it-AOA-5000-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, gemma-2-9b-it-AOA-5000-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-9b-it-AOA-5000-hexphi
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-9b-it-AOA-5000-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. gemma-2-9b-it-AOA-10-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, gemma-2-9b-it-AOA-10-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-9b-it-AOA-10-hexphi
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-9b-it-AOA-10-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. gemma-2-9b-it-yessir-1000-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, gemma-2-9b-it-yessir-1000-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/gemma-2-9b-it-yessir-1000-hexphi
    Authors
    Kazdan
    Description

    jkazdan/gemma-2-9b-it-yessir-1000-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. Llama-3.2-3B-Instruct-original-0-hexphi

    • huggingface.co
    + more versions
    Cite
    Kazdan, Llama-3.2-3B-Instruct-original-0-hexphi [Dataset]. https://huggingface.co/datasets/jkazdan/Llama-3.2-3B-Instruct-original-0-hexphi
    Authors
    Kazdan
    Description

    jkazdan/Llama-3.2-3B-Instruct-original-0-hexphi dataset hosted on Hugging Face and contributed by the HF Datasets community

Cite
LLM-Tuning-Safety (2023). HEx-PHI [Dataset]. https://huggingface.co/datasets/LLM-Tuning-Safety/HEx-PHI

HEx-PHI

LLM-Tuning-Safety/HEx-PHI

Human-Extended Policy-Oriented Harmful Instruction Benchmark

227 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Oct 5, 2023
Dataset authored and provided by
LLM-Tuning-Safety
License

https://choosealicense.com/licenses/other/

Description

HEx-PHI: Human-Extended Policy-Oriented Harmful Instruction Benchmark

This dataset contains 330 harmful instructions (30 examples x 11 prohibited categories) for LLM harmfulness evaluation. In our work "Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!", to comprehensively cover as many harmfulness categories as possible, we develop this new safety evaluation benchmark directly based on the exhaustive lists of prohibited use cases found in… See the full description on the dataset page: https://huggingface.co/datasets/LLM-Tuning-Safety/HEx-PHI.
