37 datasets found
  1. Prompt Engineering and Responses Dataset

    • kaggle.com
    zip
    Updated Sep 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Antrixsh Gupta (2023). Prompt Engineering and Responses Dataset [Dataset]. https://www.kaggle.com/datasets/antrixsh/prompt-engineering-and-responses-dataset
    Explore at:
    zip(12776 bytes)Available download formats
    Dataset updated
    Sep 4, 2023
    Authors
    Antrixsh Gupta
    Description

    This dataset is designed to explore the fascinating area of prompt engineering, specifically how different types of prompts can influence the generated text responses. Whether you're interested in natural language processing, conversational agents, or textual analysis, this dataset offers a rich resource for your investigations.

    Features:

    1. Prompt: The textual prompt used for generating a response.
    2. Prompt_Type: The category of the prompt, which can be a Question, Command, or Open-ended statement.
    3. Prompt_Length: The character length of the prompt.
    4. Response: The text generated in response to the prompt.

    Size and Format:

    1. The dataset contains 5010 records and is approximately 705KB in size.
    2. It is provided in CSV format for easy manipulation and analysis.

    Potential Applications:

    Prompt Effectiveness: Study how different types of prompts yield different kinds of responses.

    Conversational Agents: Train and evaluate dialogue systems to better understand user intents.

    Text Generation Models: Analyze how various prompts affect the performance of text generation models like GPT-4.

    Sentiment Analysis: Explore how the tone or sentiment of a prompt influences the tone or sentiment of the response.

    Academic Research: Use the dataset for various NLP or social science research topics related to human-computer interaction, dialogue systems, or machine learning.

  2. Prompt Engineering Dataset

    • kaggle.com
    zip
    Updated Apr 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Austin Fairbanks (2025). Prompt Engineering Dataset [Dataset]. https://www.kaggle.com/datasets/austinfairbanks/prompt-engineering-dataset
    Explore at:
    zip(1614382 bytes)Available download formats
    Dataset updated
    Apr 20, 2025
    Authors
    Austin Fairbanks
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Prompt Engineering Dataset: From Weak to Effective AI Prompts

    Dataset Description

    This comprehensive dataset contains 1,000 examples of prompt engineering transformations, showing how to turn basic, ineffective prompts into powerful, high-quality prompts using established techniques. Each example includes:

    • Task description
    • Complexity level (low, medium, high)
    • Original weak/vague prompt
    • Improved effective prompt
    • Expected response pattern
    • Specific prompting techniques applied
    • Prompt category/type

    Key Features

    • Balanced complexity distribution: 50% medium, 25% low, and 25% high complexity tasks
    • Diverse prompt types: Covers 16 different categories including informational, question-answering, creative writing, code generation, and more
    • Multiple techniques: Demonstrates 11 distinct prompting methods including role prompting, chain of thought, contextual prompting, and one-shot/few-shot examples
    • Real-world applicable: Tasks span domains like science, business, creative writing, data analysis, and technical subjects

    Applications

    • Train models to automatically improve user prompts
    • Study effective prompt engineering patterns and techniques
    • Develop educational materials for teaching prompt engineering
    • Benchmark prompt optimization algorithms
    • Fine-tune LLMs to better understand user intent from minimal instructions

    Methodology

    This dataset was created using a Gemini 2.0 Flash-powered pipeline that generated diverse task descriptions across complexity levels and prompt types, then applied appropriate prompting techniques to create powerful, effective versions of originally weak prompts.

    Citation

    If you use this dataset in your research or applications, please cite: @dataset{oneprompted_prompt_engineering_2024, author = {OneProm.pt}, title = {Prompt Engineering Transformation Dataset}, year = {2024}, publisher = {Kaggle}, url = {https://www.kaggle.com/datasets/oneprompted/prompt-engineering-transformation} }

  3. Prompt Engineering Dataset

    • kaggle.com
    zip
    Updated Mar 6, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sarfaraz Ahmed (2024). Prompt Engineering Dataset [Dataset]. https://www.kaggle.com/datasets/sarfaraz021/prompt-engineering-dataset
    Explore at:
    zip(25546 bytes)Available download formats
    Dataset updated
    Mar 6, 2024
    Authors
    Sarfaraz Ahmed
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Sarfaraz Ahmed

    Released under CC0: Public Domain

    Contents

  4. Generated Apple Tree Dataset Prompt Engineering

    • kaggle.com
    zip
    Updated Jul 12, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Prompt Engineering vs Fine-Tuning (2023). Generated Apple Tree Dataset Prompt Engineering [Dataset]. https://www.kaggle.com/datasets/royvoetman/generated-apple-tree-dataset-prompt-engineering
    Explore at:
    zip(614849780 bytes)Available download formats
    Dataset updated
    Jul 12, 2023
    Authors
    Prompt Engineering vs Fine-Tuning
    License

    Open Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Generated Apple Tree Dataset (Prompt Engineering) Datasets for the paper: Using Diffusion Models for Dataset Generation: Prompt Engineering vs. Fine-tuning

    Annotation format For each image, there is a txt file with the same name where each row indicates a distinct bounding box.

    A box coordinates are formatted as X1 Y1 X2 Y2 with absolute coordinates.

  5. PROMPT ENGINEER

    • kaggle.com
    zip
    Updated Mar 14, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rushaliny (2024). PROMPT ENGINEER [Dataset]. https://www.kaggle.com/datasets/rushaliny/promot-engineer
    Explore at:
    zip(520 bytes)Available download formats
    Dataset updated
    Mar 14, 2024
    Authors
    Rushaliny
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains information about engineers within a company who have been promoted over a certain time period. The dataset includes features such as engineer ID, years of experience, education level, performance ratings, specialized skills, projects completed, and the promotion outcome (whether promoted or not). It's a valuable resource for analyzing factors contributing to engineer promotions and building predictive models to forecast future promotions within similar contexts.

  6. Prompt Engineering

    • kaggle.com
    zip
    Updated Jan 12, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Minh Nguyen Dich Nhat (2025). Prompt Engineering [Dataset]. https://www.kaggle.com/datasets/minhnguyendichnhat/prompt-engineering/code
    Explore at:
    zip(6214 bytes)Available download formats
    Dataset updated
    Jan 12, 2025
    Authors
    Minh Nguyen Dich Nhat
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Minh Nguyen Dich Nhat

    Released under MIT

    Contents

  7. 22365_3_Prompt Engineering_v7.pdf

    • kaggle.com
    zip
    Updated Apr 8, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    shushanth mittpally (2025). 22365_3_Prompt Engineering_v7.pdf [Dataset]. https://www.kaggle.com/datasets/shushanthmittpally/22365-3-prompt-engineering-v7-pdf/code
    Explore at:
    zip(6627613 bytes)Available download formats
    Dataset updated
    Apr 8, 2025
    Authors
    shushanth mittpally
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by shushanth mittpally

    Released under MIT

    Contents

  8. Prompt Injection Malignant

    • kaggle.com
    zip
    Updated Apr 25, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mary Camila (2024). Prompt Injection Malignant [Dataset]. https://www.kaggle.com/datasets/marycamilainfo/prompt-injection-malignant
    Explore at:
    zip(3216383 bytes)Available download formats
    Dataset updated
    Apr 25, 2024
    Authors
    Mary Camila
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Prompt Injection

    The use of prompts for diverse tasks becomes more prevalent, concerns arise regarding the security of information shared between models and users, as LLMs face vulnerability in receiving potentially harmful prompts with malicious intent from users.

    Vulnerabilities associated with prompt engineering can range from bias and inappropriate responses to cybersecurity issues, raising fundamental questions about the ethics, transparency, and accountability that surround the use of these advanced technologies.

    As the number one of the main current vulnerability of LLMs, prompt injection is the insertion of instructions to alter the expected behavior of the output of a Large Language Model and is usually embedded in the prompt. It can range from simple changes in configured behavior to malicious code snippets that compromise the models integrity and information.

    Dataset Overview

    We introduce a dataset, named Malignant, specifically curated for jailbreak prompt injection instances. A jailbreak attack is based on adversarial inputs, where their purpose is to break the safe model behavior as the model’s output produces harmful content.

    This dataset serves as a valuable resource for future research endeavors aimed at addressing prompt injection vulnerabilities.

    The methodology paper and models already trained scripts can be found here: - https://github.com/llm-security-research/malicious-prompts - https://vinbinary.xyz/malignant_and_promptsentinel.pdf

    Column Description:

    category: Three categories can be found: - jailbreak: We gathered 70 prompts from the jailbreak portal (it is not available since 2024), focusing on the theme of jailbreak attacks and curating with established patterns in such scenarios. Through data augmentation, we produced 129 paraphrased jailbreak prompts. In total, the malignant dataset consists of 199 jailbreak prompts. - act_as: We augmented the robustness of model detection for jailbreak prompt injection by introducing hard prompts. A distinct category for hard prompts is integrated into the malignant dataset, sourced from the AweosomeChatGPT portal. Also referred to as manual prompts, these inputs serve as role prompts to condition the context, influencing the behavior of the language model. With 24 initially collected prompts, we applied the rephrase method for dataset augmentation, yielding a total of 69 hard prompts after a results review. - conversation: In order to evaluate a model to detect jailbreak prompts, conversation prompts for model training were extracted solely from the Persona-Chat dataset, with a total of 1312 prompts included.

    base_class: Six categories can be found:
    - paraphrase: Data augmentation was performed on jailbreak prompts to achieve better results in model training. - conversation: Phrase collected from Persona-Chat dataset. - role_play: - output_constraint: - privilege_escalation:

    text: The string phrase collected from the datasources listed below.

    embedding: Text embeddings generated using the model paraphrase-multilingual-MiniLM-L12-v2 from SentenceTransformers to generate 384 dimensional embeddings.

    As the only public dataset available to our knowledge at this time, we hope it can be useful for researchers and people who are concerned about AI ethics and want to make a difference!

  9. Foundations of LLMs and Prompt Engineering

    • kaggle.com
    zip
    Updated Sep 18, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ryan K. Adams (2025). Foundations of LLMs and Prompt Engineering [Dataset]. https://www.kaggle.com/datasets/ryankadams/foundations-of-llms-and-prompt-engineering/code
    Explore at:
    zip(8433 bytes)Available download formats
    Dataset updated
    Sep 18, 2025
    Authors
    Ryan K. Adams
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This resource serves as a comprehensive guide to understanding the Foundations of Large Language Models (LLMs) and the principles behind Prompt Engineering. It provides essential information on how LLMs like GPT-3, BERT, and T5 work, along with practical examples of how to optimize prompts for specific tasks, improving model performance and output quality.

  10. Generated Apple Tree Dataset Fine-tuning

    • kaggle.com
    zip
    Updated Jul 11, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Prompt Engineering vs Fine-Tuning (2023). Generated Apple Tree Dataset Fine-tuning [Dataset]. https://www.kaggle.com/datasets/royvoetman/generated-apple-trees-fine-tuning
    Explore at:
    zip(623019108 bytes)Available download formats
    Dataset updated
    Jul 11, 2023
    Authors
    Prompt Engineering vs Fine-Tuning
    License

    Open Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Generated Apple Tree Dataset (Fine Tuning) Datasets for the paper: Using Diffusion Models for Dataset Generation: Prompt Engineering vs. Fine-tuning

    Annotation format For each image, there is a txt file with the same name where each row indicates a distinct bounding box.

    These annotations conform to the YOLO Ultralytics annotation format as described: https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/#12-create-labels_1 which specifies the "class x_center y_center width height" format with relative coordinates.

  11. System Prompts from Leading LLM Services

    • kaggle.com
    zip
    Updated Jul 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Volodymyr Pivoshenko 🇺🇦 (2025). System Prompts from Leading LLM Services [Dataset]. https://www.kaggle.com/datasets/volodymyrpivoshenko/system-prompts-from-leading-llm-services
    Explore at:
    zip(150566 bytes)Available download formats
    Dataset updated
    Jul 10, 2025
    Authors
    Volodymyr Pivoshenko 🇺🇦
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    A curated dataset featuring system prompts sourced from major Large Language Model (LLM) providers such as OpenAI, Google, Anthropic, and more. This collection is designed to support research, benchmark, and innovation in prompt engineering by offering a diverse range of real-world prompts used to guide and control state-of-the-art language models.

  12. LLM_UTILITIES

    • kaggle.com
    zip
    Updated Mar 9, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Strider (2025). LLM_UTILITIES [Dataset]. https://www.kaggle.com/someshchatterjee/llm-utilities
    Explore at:
    zip(13832 bytes)Available download formats
    Dataset updated
    Mar 9, 2025
    Authors
    Strider
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description
  13. Prompt Injection payloads

    • kaggle.com
    zip
    Updated Jan 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    aestera (2025). Prompt Injection payloads [Dataset]. https://www.kaggle.com/datasets/aestera/prompt-injection-payloads
    Explore at:
    zip(11375 bytes)Available download formats
    Dataset updated
    Jan 11, 2025
    Authors
    aestera
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by aestera

    Released under MIT

    Contents

  14. TechPrompt-QA

    • kaggle.com
    zip
    Updated May 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Emam (2024). TechPrompt-QA [Dataset]. https://www.kaggle.com/datasets/elemam/quetions-on-computer-science-llms
    Explore at:
    zip(1228038220 bytes)Available download formats
    Dataset updated
    May 18, 2024
    Authors
    Emam
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Description: TechPrompt-QA

    Overview: TechPrompt-QA is a curated dataset containing high-quality questions related to technology and computer science. Filtered from the Open-Orca Augmented Flan Dataset and other prompt-based datasets, this collection is ideal for training LLMs (Large Language Models), chatbots, and NLP models in technical domains.

    Features:

    Technology & Computer Science Focused: Covers a wide range of topics, including programming, AI, cybersecurity, cloud computing, databases, networking, and software engineering.
    Diverse Question Types: Includes multiple-choice, open-ended, coding-related, and theoretical questions.
    Prompt-Based Structure: Well-formatted and structured for easy integration into prompt engineering workflows.
    AI Model Training Ready: Useful for training models in Q&A generation, retrieval-augmented generation (RAG), and knowledge-based AI applications.
    

    Use Cases:

    Fine-tuning LLMs for tech-focused Q&A systems.
    Improving chatbots and virtual assistants in technology-related domains.
    Enhancing question-answering datasets for education and research.
    Evaluating AI models on technical reasoning and problem-solving skills.
    

    Dataset Format:

    Columns: Question, Answer (if available), Topic, Difficulty Level (if applicable).
    Available Formats: JSON, CSV, Parquet.
    

    This dataset is designed to support developers, researchers, and AI enthusiasts in building smarter and more accurate technical AI models. 🚀

  15. Sentiment Analysis Dataset

    • kaggle.com
    Updated May 27, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Samarth Kuchya (2024). Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/samarthkumarkuchya/sentiment-analysis-dataset/versions/1
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 27, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Samarth Kuchya
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This data has been created using prompt engineering over chatGPT which has following labels - 0 - negative 1 - neutral 2 - positive

  16. GPT-OSS-20B Adversarial Prompt Catalog v0

    • kaggle.com
    zip
    Updated Aug 14, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jeff Borschowa (2025). GPT-OSS-20B Adversarial Prompt Catalog v0 [Dataset]. https://www.kaggle.com/datasets/jeffborschowa/gpt-oss-20b-adversarial-prompt-catalog-v0
    Explore at:
    zip(567 bytes)Available download formats
    Dataset updated
    Aug 14, 2025
    Authors
    Jeff Borschowa
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    📂 GPT-OSS-20B Adversarial Prompt Catalog

    Author: [Your Name]
    License: CC BY 4.0
    Tags: openai, gpt-oss-20b, red-teaming, ai-safety, prompt-engineering

    📌 Overview

    This dataset contains a curated set of adversarial prompts and their associated metadata from red-teaming runs against OpenAI’s GPT-OSS-20B model.
    All unsafe outputs are redacted or hashed to ensure compliance with Kaggle policy.

    📊 Schema

    ColumnTypeDescription
    prompt_redactedstringThe adversarial prompt text with unsafe content replaced by [REDACTED].
    categorystringSafety category (e.g., misinformation, self-harm, disallowed content).
    patternstringPrompt pattern/technique used (e.g., CoU, instruction-hierarchy, obfuscation).
    stepsstringMinimal reproducible steps for this prompt.
    reproduction_notesstringAdditional notes on reproducing the failure.
    outcome_labelstringOutcome classification (e.g., refusal, partial compliance, unsafe).

    📜 Ethics Statement

    • No unsafe or disallowed content is included in plaintext.
    • All examples are for AI safety research and responsible disclosure.
    • This dataset complies with Kaggle’s content guidelines.

    🔗 Related Resources

    📜 Citation

    If you use this dataset, please cite this dataset page and the competition link.

  17. prompts

    • kaggle.com
    zip
    Updated Jun 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    @Ravi (2023). prompts [Dataset]. https://www.kaggle.com/datasets/raviiloveyou/prompts/suggestions
    Explore at:
    zip(12419 bytes)Available download formats
    Dataset updated
    Jun 21, 2023
    Authors
    @Ravi
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Prompts play a crucial role in guiding language models like ChatGPT to generate relevant and coherent responses. They serve as instructions or cues that provide context and steer the model's understanding and output. Effective prompts can shape the conversation, elicit specific information, or encourage creative responses. Prompt engineering, on the other hand, refers to the process of designing and refining prompts to achieve desired outcomes. Both prompts and prompt engineering are important for several reasons

    prompts and prompt engineering are essential for guiding language models, enabling control over outputs, generating desired content, fostering creativity, and enhancing the overall user experience. They form a critical component in the interaction between users and AI systems, ensuring meaningful and contextually appropriate conversations. This is one of the inspiration behind this dataset.

    In this dataset we generated this prompts samples by various chatbots and few from Bard and from ChatGpt. the main intention and idea behind that is 1) Prompt Engineering 2) Rich data . This type of few samples of prompt which for helpful for training various generative ai applications.but in this dataset the prompts samples are low amount .but you generate synthetic data from that .

  18. Google AI Whitepaper knowledgebase

    • kaggle.com
    zip
    Updated Apr 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    malavp1703 (2025). Google AI Whitepaper knowledgebase [Dataset]. https://www.kaggle.com/datasets/malavp1703/google-ai-whitepaer-knowledgebase
    Explore at:
    zip(62415619 bytes)Available download formats
    Dataset updated
    Apr 20, 2025
    Authors
    malavp1703
    Description

    This dataset is a ground of whitepapers shared by Google in its AI workshop. It is a knowledgebase on various GenAI topics including prompt engineering, vector databases, embeddings, RAG, Agents, Agent companions, fine tuning and use of MLops in GenAI planning.

  19. AI Safety Verification Dataset

    • kaggle.com
    zip
    Updated Aug 25, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Priyam Saha (2025). AI Safety Verification Dataset [Dataset]. https://www.kaggle.com/datasets/priyamsaha17/ai-safety-verification-dataset
    Explore at:
    zip(147706374 bytes)Available download formats
    Dataset updated
    Aug 25, 2025
    Authors
    Priyam Saha
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Overview An aggregated, cleaned, and unified dataset assembled from the AI Verify Foundation’s Project Moonshot resources on Kaggle. It combines: (a) prompt templates and prompt-engineering cookbooks, (b) pre-built recipes used to configure benchmark runs (input/target pairs, evaluation metric, grading scales), and (c) metric definitions/outputs for automated evaluation. The material is intended to support reproducible LLM benchmarking, bias/fairness analysis, and prompt-engineering experiments.

    Project Moonshot Project Moonshot is an open-source LLM evaluation toolkit produced by the AI Verify Foundation; it brings benchmarking and red-teaming workflows together and publishes prompt templates, recipes and metrics on GitHub and the Moonshot docs site. Link - https://aiverifyfoundation.sg/project-moonshot/

    Recipe Recipes (in Moonshot) are pre-built benchmark configurations: JSON files that define the dataset (input / target pairs), the prompt template to use, the evaluation metric, and any grading thresholds — enabling reproducible, repeatable test runs. The Moonshot project publishes many such recipes for different evaluation categories (e.g., prompt injection, cybersecurity).

    Cookbook Cookbook (in ML/prompting context) is a curated collection of patterns, examples and “how-to” snippets for solving common tasks with LLMs (templates, best practices, and worked examples). Think of a cookbook as a higher-level collection that organizes recipes and templates for reuse

    Intended uses - Reproducible LLM benchmarking and regression testing. - Bias and fairness audits (compare performance across social attribute groups). - Prompt engineering research (compare prompt templates / recipe variants). - Building evaluation pipelines that combine semantic and factual checks.

    Credits: This dataset aggregates content published by the AI Verify Foundation / Project Moonshot. Please follow the original project’s license and attribution requirements when redistributing. See the Moonshot repository for license details. URL: https://aiverifyfoundation.sg/project-moonshot/ GitHub: https://github.com/aiverify-foundation/moonshot

  20. Google_GenAI_Intensive_April_2025

    • kaggle.com
    zip
    Updated Apr 8, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Todd Gardiner (2025). Google_GenAI_Intensive_April_2025 [Dataset]. https://www.kaggle.com/datasets/toddgardiner/google-genai-intensive-april-2025
    Explore at:
    zip(55765418 bytes)Available download formats
    Dataset updated
    Apr 8, 2025
    Authors
    Todd Gardiner
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    From March 31 through April 4, 2025, Google experts went through foundational gen AI topics like prompt engineering, evaluations, and embeddings. Coursework included these whitepapers by Google experts, AI-generated podcast outputs (NotebookLM), and practical code labs for hands-on experience with Gemini and other services. These are posted for people to use in their capstone projects at the end of the course.

    If you need to see how the text CSV was generated, the code is here https://www.kaggle.com/code/toddgardiner/google-5-day-genai-intensive-whitepapers-to-text/.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Antrixsh Gupta (2023). Prompt Engineering and Responses Dataset [Dataset]. https://www.kaggle.com/datasets/antrixsh/prompt-engineering-and-responses-dataset
Organization logo

Prompt Engineering and Responses Dataset

Exploring the Influence of Different Prompt Types on Text Responses

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
zip(12776 bytes)Available download formats
Dataset updated
Sep 4, 2023
Authors
Antrixsh Gupta
Description

This dataset is designed to explore the fascinating area of prompt engineering, specifically how different types of prompts can influence the generated text responses. Whether you're interested in natural language processing, conversational agents, or textual analysis, this dataset offers a rich resource for your investigations.

Features:

  1. Prompt: The textual prompt used for generating a response.
  2. Prompt_Type: The category of the prompt, which can be a Question, Command, or Open-ended statement.
  3. Prompt_Length: The character length of the prompt.
  4. Response: The text generated in response to the prompt.

Size and Format:

  1. The dataset contains 5010 records and is approximately 705KB in size.
  2. It is provided in CSV format for easy manipulation and analysis.

Potential Applications:

Prompt Effectiveness: Study how different types of prompts yield different kinds of responses.

Conversational Agents: Train and evaluate dialogue systems to better understand user intents.

Text Generation Models: Analyze how various prompts affect the performance of text generation models like GPT-4.

Sentiment Analysis: Explore how the tone or sentiment of a prompt influences the tone or sentiment of the response.

Academic Research: Use the dataset for various NLP or social science research topics related to human-computer interaction, dialogue systems, or machine learning.

Search
Clear search
Close search
Google apps
Main menu