100+ datasets found
  1. SaferDecoding Fine Tuning Dataset

    • zenodo.org
    json
    Updated Jan 7, 2025
    Cite
    Anders Spear; Connor Dilgren; Hiba El Oirghi; Jost Luebbe; Sahire Ellahy; Hamza Iseric (2025). SaferDecoding Fine Tuning Dataset [Dataset]. http://doi.org/10.5281/zenodo.14511194
    Explore at:
    Available download formats: json
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Anders Spear; Connor Dilgren; Hiba El Oirghi; Jost Luebbe; Sahire Ellahy; Hamza Iseric
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is mirrored on Hugging Face: https://huggingface.co/datasets/aspear/saferdecoding-fine-tuning/blob/main/README.md

    This dataset is intended for fine-tuning models to defend against jailbreak attacks. It is an extension of SafeDecoding.

    This dataset includes 252 original human-generated adversarial seed prompts, covering 18 harmful categories.

    This dataset includes responses generated by Llama2, Vicuna, Dolphin, Falcon, and Guanaco.

    Responses were generated by passing the adversarial seed prompts to each model. Only responses that rejected the request were recorded.
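
    Since the dataset is mirrored on Hugging Face, it can be pulled with the datasets library. A minimal sketch follows; the split and field names are assumptions, so check the README for the actual schema.

    # Minimal sketch: load the Hugging Face mirror of this dataset.
    # Split and field names are assumptions; see the README for the real schema.
    from datasets import load_dataset

    ds = load_dataset("aspear/saferdecoding-fine-tuning")
    print(ds)                            # available splits and features
    first_split = next(iter(ds.values()))
    print(first_split[0])                # inspect the first record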

  2. llama2-sst2-fine-tuning

    • huggingface.co
    Updated Aug 2, 2023
    Cite
    Yifei (2023). llama2-sst2-fine-tuning [Dataset]. https://huggingface.co/datasets/OneFly7/llama2-sst2-fine-tuning
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 2, 2023
    Authors
    Yifei
    Description

    Dataset Card for "llama2-sst2-finetuning"

      Dataset Description
    

    The llama2-sst2-fine-tuning dataset is designed for supervised fine-tuning of LLaMA V2 on the GLUE SST-2 sentiment classification task. We provide two subsets: training and validation. To make fine-tuning effective, we convert the data into the LLaMA V2 supervised fine-tuning prompt template, where the data follows this format:
    [INST] <
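
    The prompt template above is truncated. As a rough illustration only, here is a minimal sketch of rendering one SST-2 example into the standard LLaMA V2 instruction format; the [INST]/<<SYS>> wording below is an assumption and may differ from the template actually used by this dataset.

    # Hypothetical sketch: format a GLUE SST-2 row as a LLaMA V2 style prompt.
    # The system prompt and exact wording are assumptions, not the dataset's template.
    def to_llama2_prompt(sentence: str, label: int) -> str:
        system = "Classify the sentiment of the sentence as positive or negative."
        answer = "positive" if label == 1 else "negative"
        return (
            f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
            f"Sentence: {sentence} [/INST] {answer}"
        )

    print(to_llama2_prompt("a delightful, funny film", 1))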

  3. tool-use-finetuning

    • huggingface.co
    Updated Jul 27, 2025
    Cite
    Shawhin Talebi (2025). tool-use-finetuning [Dataset]. https://huggingface.co/datasets/shawhin/tool-use-finetuning
    Explore at:
    Dataset updated
    Jul 27, 2025
    Authors
    Shawhin Talebi
    Description

    Dataset for fine-tuning gemma-3-1b-it for function calling. The code and other resources for this project are linked below. Resources:

    YouTube Video Blog Post GitHub Repo Fine-tuned Model | Original Model

      Citation
    

    If you find this dataset helpful, please cite: @dataset{talebi2025, author = {Shaw Talebi}, title = {tool-use-finetuning}, year = {2025}, publisher = {Hugging Face}, howpublished =… See the full description on the dataset page: https://huggingface.co/datasets/shawhin/tool-use-finetuning.

  4. HuggingFace models

    • redivis.com
    Updated Feb 24, 2025
    Cite
    (2025). HuggingFace models [Dataset]. https://redivis.com/workflows/gxw9-5ey3j79zs
    Explore at:
    Dataset updated
    Feb 24, 2025
    Description

    Container dataset for demonstration of Hugging Face models on Redivis. Currently just contains a single BERT model, but may expand in the future.

  5. llama-2-banking-fine-tune

    • huggingface.co
    Updated Jul 28, 2023
    Cite
    Argilla (2023). llama-2-banking-fine-tune [Dataset]. https://huggingface.co/datasets/argilla/llama-2-banking-fine-tune
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 28, 2023
    Dataset authored and provided by
    Argilla
    Description

    Dataset Card for llama-2-banking-fine-tune

    This dataset has been created with Argilla. As shown in the sections below, this dataset can be loaded into Argilla as explained in Load with Argilla, or used directly with the datasets library in Load with datasets.

      Dataset Summary
    

    This dataset contains:

    A dataset configuration file conforming to the Argilla dataset format named argilla.yaml. This configuration file will be used to configure the dataset when using the… See the full description on the dataset page: https://huggingface.co/datasets/argilla/llama-2-banking-fine-tune.
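
    A minimal sketch of the "Load with datasets" route mentioned above; the split name and record fields are assumptions that follow the usual Argilla export layout.

    # Minimal sketch: load the dataset directly with the datasets library.
    # The split name and record fields are assumptions; inspect the output.
    from datasets import load_dataset

    ds = load_dataset("argilla/llama-2-banking-fine-tune")
    print(ds)                # splits and features
    print(ds["train"][0])    # first record, assuming a "train" split exists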

  6. wayfinder-fine-tuning-dataset

    • huggingface.co
    Updated Mar 28, 2024
    Cite
    Pavel (2024). wayfinder-fine-tuning-dataset [Dataset]. https://huggingface.co/datasets/pavelmarcolian/wayfinder-fine-tuning-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 28, 2024
    Authors
    Pavel
    Description

    pavelmarcolian/wayfinder-fine-tuning-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Training_Data_FineTuning_LLM_PEFT_LORA

    • kaggle.com
    zip
    Updated Aug 8, 2024
    Cite
    Rupak Roy/ Bob (2024). Training_Data_FineTuning_LLM_PEFT_LORA [Dataset]. https://www.kaggle.com/datasets/rupakroy/training-dataset-peft-lora
    Explore at:
    Available download formats: zip (29562174 bytes)
    Dataset updated
    Aug 8, 2024
    Authors
    Rupak Roy/ Bob
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The dataset contains conversation summaries, topics, and dialogues used to build a fine-tuning pipeline for an LLM with Parameter-Efficient Fine-Tuning (PEFT) and Low-Rank Adaptation (LoRA), a popular and lightweight training technique that significantly reduces the number of trainable parameters.

    The dataset is also available on Hugging Face: https://huggingface.co/datasets/knkarthick/dialogsum
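
    As a rough illustration of the PEFT/LoRA setup the description refers to, here is a minimal sketch using the Hugging Face peft library; the base model, target modules, and hyperparameters are illustrative assumptions rather than the author's configuration.

    # Minimal sketch: wrap a causal LM with LoRA adapters via peft.
    # Model name, target modules, and hyperparameters are illustrative assumptions.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM works here
    lora_cfg = LoraConfig(
        r=8,                        # rank of the low-rank update matrices
        lora_alpha=16,              # scaling factor applied to the update
        lora_dropout=0.05,
        target_modules=["c_attn"],  # attention projection layer in GPT-2
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, lora_cfg)
    model.print_trainable_parameters()  # only a small fraction is trainable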

  8. LLAVA-fine-tune-dataset

    • huggingface.co
    Updated Sep 14, 2024
    Cite
    Mihirsinh Chauhan (2024). LLAVA-fine-tune-dataset [Dataset]. https://huggingface.co/datasets/MihirsinhChauhan/LLAVA-fine-tune-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 14, 2024
    Authors
    Mihirsinh Chauhan
    Description

    MihirsinhChauhan/LLAVA-fine-tune-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. Alpaca Cleaned

    • kaggle.com
    zip
    Updated Nov 26, 2023
    Cite
    The Devastator (2023). Alpaca Cleaned [Dataset]. https://www.kaggle.com/datasets/thedevastator/alpaca-language-instruction-training/code
    Explore at:
    Available download formats: zip (14548320 bytes)
    Dataset updated
    Nov 26, 2023
    Authors
    The Devastator
    License

    CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Alpaca Cleaned

    Improving Pretrained Language Model Understanding

    By Huggingface Hub [source]

    About this dataset

    Alpaca is a dataset for fine-tuning language models to better understand and follow instructions, taking them beyond standard Natural Language Processing (NLP) abilities. This curated, cleaned dataset provides over 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine, all in English (BCP-47 en). Its fields (instruction, output, and input) are designed to improve every aspect of a model's comprehension. The data has been cleaned to remove the errors and biases found in the original release, so you can expect improved performance from any language model trained on it.


    How to use the dataset

    This dataset provides a unique and valuable resource for anyone who wishes to create, develop and train language models. Alpaca provides users with 52,000 instruction-demonstration pairs generated by OpenAI's text-davinci-003 engine.

    The data included in this dataset is formatted into 3 columns: “instruction”, “output” and “input.” All the data is written in English (BCP-47 en).

    To make the most out of this dataset it is recommended to:

    • Familiarize yourself with the instructions in the instruction column, as these provide guidance on how to use the other two columns, input and output.

    • Once comfortable with the instruction column, explore the instruction, output, and input triplets included in this cleaned version of Alpaca.

    • Read through many examples, noting any areas where more clarification could be added or understanding could be improved, bearing in mind that these examples have been cleaned of the errors and biases found in the original dataset.

    • Get inspired! As mentioned earlier, there are more than 52k examples, giving you plenty of flexibility for varying training strategies or unique approaches when creating your own language model.

    • Finally, while not essential, it may be helpful to be familiar with OpenAI's text-davinci engine and to experiment with different parameters and options depending on the outcomes you wish to achieve.

    Research Ideas

    • Developing natural language processing (NLP) tasks that aim to better automate and interpret instructions given by humans.
    • Training machine learning models of robotic agents to be able to understand natural language commands, as well as understand the correct action that needs to be taken in response.
    • Creating a system that can generate personalized instructions and feedback in real time based on language models, catering specifically to each individual user's preferences or needs

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: train.csv

    | Column name | Description |
    |:------------|:------------|
    | instruction | This column contains the instructions for the language model. (Text) |
    | output      | This column contains the expected output from the language model. (Text) |
    | input       | This column contains the input given to the language model. (Text) |
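
    As a quick illustration of how the three columns fit together, here is a minimal sketch that formats one row of train.csv into a single training string; the prompt wording is an illustrative assumption, not part of the dataset.

    # Minimal sketch: turn one Alpaca row (instruction, input, output) into a
    # training string. The prompt wording is an illustrative assumption.
    import csv

    def format_row(row: dict) -> str:
        prompt = f"Instruction: {row['instruction']}\n"
        if row.get("input"):
            prompt += f"Input: {row['input']}\n"
        return prompt + f"Response: {row['output']}"

    with open("train.csv", newline="", encoding="utf-8") as f:
        first = next(csv.DictReader(f))
        print(format_row(first))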

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Huggingface Hub.

  10. fine-tuning-docs

    • huggingface.co
    Updated Jul 21, 2023
    Cite
    Ryan (2023). fine-tuning-docs [Dataset]. https://huggingface.co/datasets/kielerrr/fine-tuning-docs
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 21, 2023
    Authors
    Ryan
    Description

    kielerrr/fine-tuning-docs dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Fine-tuning-data

    • huggingface.co
    Updated Sep 20, 2025
    Cite
    Cheolhong Min (2025). Fine-tuning-data [Dataset]. https://huggingface.co/datasets/ch-min/Fine-tuning-data
    Explore at:
    Dataset updated
    Sep 20, 2025
    Authors
    Cheolhong Min
    Description

    ch-min/Fine-tuning-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. code_instructions_120k_alpaca

    • kaggle.com
    • huggingface.co
    zip
    Updated Aug 23, 2023
    Cite
    William J. Burns (2023). code_instructions_120k_alpaca [Dataset]. https://www.kaggle.com/datasets/wjburns/code-instructions-120k-alpaca
    Explore at:
    Available download formats: zip (62574263 bytes)
    Dataset updated
    Aug 23, 2023
    Authors
    William J. Burns
    Description

    This dataset is taken from code_instructions_120k, with an added prompt column in Alpaca style. Refer to the original source: https://huggingface.co/datasets/iamtarun/code_instructions_120k_alpaca

  13. AI-MATH-LLM-Package

    • kaggle.com
    zip
    Updated Jun 20, 2024
    Cite
    Johnson chong (2024). AI-MATH-LLM-Package [Dataset]. https://www.kaggle.com/datasets/johnsonhk88/ai-math-llm-package
    Explore at:
    Available download formats: zip (3330554065 bytes)
    Dataset updated
    Jun 20, 2024
    Authors
    Johnson chong
    Description

    This package bundle installs the essential libraries for LLM RAG and fine-tuning (such as huggingface_hub, transformers, langchain, evaluate, sentence-transformers, and others). It is intended for Kaggle competitions with offline requirements; the packages were downloaded from the Kaggle development environment.

    Supported packages: transformers, datasets, accelerate, bitsandbytes, langchain, langchain-community, sentence-transformers, chromadb, faiss-cpu, huggingface_hub, langchain-text-splitters, peft, trl, umap-learn, evaluate, deepeval, weave

    Suggested install commands in Kaggle:

    !pip install transformers --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/tranformers
    !pip install -U datasets --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/datasets
    !pip install -U accelerate --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/accelerate
    !pip install build --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/build-1.2.1-py3-none-any.whl
    !pip install -U bitsandbytes --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/bitsandbytes-0.43.1-py3-none-manylinux_2_24_x86_64.whl
    !pip install langchain --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/langchain-0.2.5-py3-none-any.whl
    !pip install langchain-core --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/langchain_core-0.2.9-py3-none-any.whl
    !pip install langsmith --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/langsmith-0.1.81-py3-none-any.whl
    !pip install langchain-community --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/langchain_community-0.2.5-py3-none-any.whl
    !pip install sentence-transformers --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/sentence_transformers-3.0.1-py3-none-any.whl
    !pip install chromadb --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/chromadb-0.5.3-py3-none-any.whl
    !pip install faiss-cpu --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/faiss_cpu-1.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
    !pip install -U huggingface_hub --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/huggingface_hub
    !pip install -qU langchain-text-splitters --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/langchain_text_splitters-0.2.1-py3-none-any.whl
    !pip install -U peft --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/peft-0.11.1-py3-none-any.whl
    !pip install -U trl --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/trl-0.9.4-py3-none-any.whl
    !pip install umap-learn --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/umap-learn
    !pip install evaluate --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/evaluate-0.4.2-py3-none-any.whl
    !pip install deepeval --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/deepeval-0.21.59-py3-none-any.whl
    !pip install weave --no-index --no-deps --find-links=file:///kaggle/input/ai-math-llm-package/download-package/weave-0.50.2-py3-none-any.whl

  14. Parameter-Efficient Fine-Tuning Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Cite
    Growth Market Reports (2025). Parameter-Efficient Fine-Tuning Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/parameter-efficient-fine-tuning-tools-market
    Explore at:
    Available download formats: pdf, pptx, csv
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Parameter-Efficient Fine-Tuning Tools Market Outlook



    According to our latest research, the global parameter-efficient fine-tuning tools market size reached USD 1.42 billion in 2024, reflecting the rapid adoption of advanced AI model customization techniques. The market is poised for robust expansion, with a projected CAGR of 24.7% during the forecast period. By 2033, the market is expected to attain a value of USD 11.6 billion, driven by the increasing demand for scalable, cost-effective, and resource-efficient AI solutions across diverse industries. This impressive growth trajectory is underpinned by the proliferation of large language models, the need for efficient model adaptation, and the surging adoption of AI-driven automation in enterprise applications.




    The most significant growth factor for the parameter-efficient fine-tuning tools market is the exponential rise in the deployment of large-scale AI models, particularly in natural language processing (NLP), computer vision, and speech recognition. These models, such as GPT, BERT, and their derivatives, require substantial computational resources for training and fine-tuning. Traditional full-model fine-tuning methods are often resource-intensive, expensive, and impractical for organizations with limited computational infrastructure. Parameter-efficient fine-tuning techniques—such as adapters, LoRA, and prompt tuning—address these challenges by enabling targeted updates to a small subset of model parameters, drastically reducing hardware requirements and operational costs. This efficiency has democratized access to advanced AI capabilities, empowering a broader spectrum of businesses and research institutes to leverage state-of-the-art machine learning without incurring prohibitive expenses.




    Another major driver fueling the parameter-efficient fine-tuning tools market is the growing need for rapid model customization in dynamic and regulated industries. Enterprises in sectors such as healthcare, finance, and government are increasingly seeking AI solutions that can be quickly adapted to evolving data, compliance requirements, and domain-specific nuances. Parameter-efficient fine-tuning tools allow organizations to update models in a fraction of the time compared to traditional methods, accelerating time-to-market for AI-powered applications. Furthermore, these tools support the preservation of model privacy and security, as they enable fine-tuning on-premises or within secure cloud environments without exposing sensitive data to external parties. This capability is particularly crucial for organizations handling confidential or regulated information, making parameter-efficient fine-tuning an indispensable component of modern AI workflows.




    The surge in open-source innovation and collaboration is also catalyzing market growth. The AI research community has made remarkable strides in developing and sharing parameter-efficient fine-tuning frameworks, libraries, and benchmarks. This collaborative ecosystem has lowered the barrier to entry for developers and enterprises seeking to experiment with and deploy advanced fine-tuning techniques. Open-source tools such as Hugging Face Transformers, PEFT (Parameter-Efficient Fine-Tuning) libraries, and community-driven repositories have become integral to the AI development lifecycle. These resources not only accelerate innovation but also foster interoperability and standardization across the industry, further propelling the adoption of parameter-efficient fine-tuning tools globally.




    From a regional perspective, North America continues to dominate the parameter-efficient fine-tuning tools market, accounting for the largest share in 2024. This leadership is attributed to the region's advanced AI research ecosystem, significant investments in AI infrastructure, and early adoption by technology giants and innovative startups. Europe and Asia Pacific are also emerging as key growth regions, driven by increasing government initiatives, academic research, and enterprise digital transformation efforts. The Asia Pacific region, in particular, is witnessing rapid adoption of AI technologies across sectors such as manufacturing, e-commerce, and telecommunications, which is expected to drive substantial market growth throughout the forecast period.



  15. fine-tuning-datasets

    • huggingface.co
    Updated Oct 24, 2025
    Cite
    Farren (2025). fine-tuning-datasets [Dataset]. https://huggingface.co/datasets/trainfarren/fine-tuning-datasets
    Explore at:
    Dataset updated
    Oct 24, 2025
    Dataset authored and provided by
    Farren
    Description

    trainfarren/fine-tuning-datasets dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. fine-tuning

    • huggingface.co
    Updated Jun 26, 2025
    Cite
    Peter Patranika (2025). fine-tuning [Dataset]. https://huggingface.co/datasets/fetost/fine-tuning
    Explore at:
    Dataset updated
    Jun 26, 2025
    Authors
    Peter Patranika
    Description

    fetost/fine-tuning dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. ai-job-embedding-finetuning

    • huggingface.co
    Cite
    Shawhin Talebi, ai-job-embedding-finetuning [Dataset]. https://huggingface.co/datasets/shawhin/ai-job-embedding-finetuning
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Authors
    Shawhin Talebi
    Description

    Dataset for fine-tuning an embedding model for AI job search. Data sourced from datastax/linkedin_job_listings. Data used to fine-tune shawhin/distilroberta-ai-job-embeddings for AI job search. Links

    GitHub Repo Video link Blog link
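
    A minimal sketch of using the fine-tuned embedding model linked above for job search with sentence-transformers; the query and job texts are illustrative.

    # Minimal sketch: rank job postings against a query with the fine-tuned
    # embedding model. The query and job texts are illustrative.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("shawhin/distilroberta-ai-job-embeddings")
    query = "machine learning engineer, LLM fine-tuning"
    jobs = ["Senior ML engineer working on LLMs", "Front-end developer, React"]

    scores = util.cos_sim(model.encode(query), model.encode(jobs))
    print(scores)  # higher cosine similarity = better match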

  18. saferdecoding-fine-tuning

    • huggingface.co
    Updated Dec 19, 2024
    Cite
    Anders Spear (2024). saferdecoding-fine-tuning [Dataset]. https://huggingface.co/datasets/aspear/saferdecoding-fine-tuning
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 19, 2024
    Authors
    Anders Spear
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for SaferDecoding Fine Tuning Dataset

    This dataset is intended for fine-tuning models to defend against jailbreak attacks. It is an extension of SafeDecoding.

      Dataset Details

      Dataset Description

    The dataset generation process was adapted from SafeDecoding. This dataset includes 252 original human-generated adversarial seed prompts, covering 18 harmful categories. This dataset includes responses generated by Llama2, Vicuna, Dolphin, Falcon… See the full description on the dataset page: https://huggingface.co/datasets/aspear/saferdecoding-fine-tuning.

  19. real-estate-data-for-llm-fine-tuning

    • huggingface.co
    Updated May 11, 2025
    Cite
    Heba (2025). real-estate-data-for-llm-fine-tuning [Dataset]. http://doi.org/10.57967/hf/5361
    Explore at:
    Dataset updated
    May 11, 2025
    Authors
    Heba
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    heba1998/real-estate-data-for-llm-fine-tuning dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. llama3-fine-tuning

    • huggingface.co
    Updated Jul 31, 2024
    Cite
    MrFace (2024). llama3-fine-tuning [Dataset]. https://huggingface.co/datasets/MrFacewhythisnameexists/llama3-fine-tuning
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 31, 2024
    Authors
    MrFace
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    MrFacewhythisnameexists/llama3-fine-tuning dataset hosted on Hugging Face and contributed by the HF Datasets community
