7 datasets found

GPT Roleplay Realm: Enhanced Character
kaggle.com
Updated Nov 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
The Devastator (2023). GPT Roleplay Realm: Enhanced Character [Dataset]. https://www.kaggle.com/datasets/thedevastator/gpt-roleplay-realm-enhanced-character-role-playi
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 30, 2023
Dataset provided by
Kaggle
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
GPT Roleplay Realm: Enhanced Character Role-playing Dataset

Character Cards and Dialogues for immersive role-playing experiences

By Ilya Gusev (From Huggingface) [source]

About this dataset

The GPT Roleplay Realm dataset is a valuable resource for enhancing the capabilities of language models in the realm of character role-playing. Specifically designed to facilitate immersive role-playing experiences, this dataset is comprised of character cards generated by GPT models. These character cards contain essential information such as names, greetings, example dialogues, context, topics of interest, dialogues involving the characters, and image prompts.

With a focus on enriching language models' ability to engage in dynamic and realistic interactions with fictional characters, this dataset provides users with a diverse range of well-rounded characters to incorporate into their role-playing scenarios. Each character card includes a name that gives them an individual identity and distinction within the narrative.

Additionally, context descriptions offer crucial background information about each character's history or personality traits that can lend depth and authenticity to their portrayal. Greetings act as introductory statements that set the tone for interactions with these virtual personas.

Example dialogues showcase how these characters might converse within specific scenarios or settings. These conversations serve as guidelines for users when constructing interactive narratives or engaging in linguistic exchanges with these language model-generated characters.

Moreover, topics provided on each character card indicate the areas of expertise or interests that are inherent to each persona within the realm created by GPT models. This information enables users to generate dialogue that aligns with each character's unique knowledge base or passions.

Furthermore, dialogues involving additional participants allow for multi-person exchanges and enable more intricate storytelling possibilities within virtual worlds. This feature enhances user engagement by promoting collaborative storytelling among multiple AI-generated characters.

To enhance visual immersion and aid user creativity during role-playing experiences, image prompts are also included on each character card. These suggestive visuals stimulate users' imagination regarding how each character may appear physically based on their described features or characteristics.

In conclusion, by providing extensive details about fictional personas generated by language models via sample dialogues along with their relevant context descriptions, interests/topics listicles paired up provocative visual prompts, the GPT Roleplay Realm dataset elevates the standards of language models in creating immersive and engaging role-playing experiences

How to use the dataset

How to Use This Dataset: GPT Roleplay Realm

Welcome to the GPT Roleplay Realm dataset! This guide will help you navigate and make the best use of this enhanced character role-playing dataset.

Overview

The GPT Roleplay Realm dataset consists of character cards generated by GPT models. These character cards contain names, greetings, example dialogues, context, topics of interest, dialogues involving the characters, and image prompts. The purpose of this dataset is to provide language models with rich information about fictional characters that can be used for immersive role-playing experiences.

Understanding the Columns

The dataset is primarily organized into several columns:

name: The name of the character.

context: A brief description or background information about the character.

greeting: The initial greeting or introduction phrase of each character.

example_dialogue: A sample dialogue or conversation involving each character.

topics: The topics or themes that each character is knowledgeable or interested in.

dialogues: Additional dialogues or conversations involving each character.

image_prompt: Prompts or descriptions for images that represent each character.

Getting Started

When exploring this dataset, it may be helpful to first get a sense of all the available characters by examining their names using the name column.

You can then dive deeper into a specific character's information by exploring their context in order to understand their background and story.

To engage with a specific character in a role-playing scenario, start by using their provided greeting as an introductory statement towards them.

If you want to understand how different characters interact with ...
h
ChatGPT-RealUser-2.2M-preview
huggingface.co
Updated Aug 30, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gata (2025). ChatGPT-RealUser-2.2M-preview [Dataset]. https://huggingface.co/datasets/Gata-community/ChatGPT-RealUser-2.2M-preview
Explore at:
Dataset updated
Aug 30, 2025
Dataset authored and provided by
Gata
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
ChatGPT-RealUser-2.2M: A Large-Scale Dataset of Real-User, Real-World ChatGPT Conversations

ChatGPT-RealUser-2.2M is a large-scale dataset of real-user, Real-World ChatGPT conversations developed by Gata. From 2024–2025, participants using Gata’s GPT-to-Earn product opted in to share their chats and earned points based on conversation quality. The dataset covers GPT-3.5, GPT-4, and o1 models, and contains 2,244,389 conversations from 15,316 unique users. Because many chats are… See the full description on the dataset page: https://huggingface.co/datasets/Gata-community/ChatGPT-RealUser-2.2M-preview.
h
Bitext-insurance-llm-chatbot-training-dataset
huggingface.co
Updated Aug 24, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bitext (2024). Bitext-insurance-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-insurance-llm-chatbot-training-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 24, 2024
Dataset authored and provided by
Bitext
License
https://choosealicense.com/licenses/cdla-sharing-1.0/https://choosealicense.com/licenses/cdla-sharing-1.0/
Description
Bitext - Insurance Tagged Training Dataset for LLM-based Virtual Assistants

Overview

This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the [insurance] sector can be easily achieved using our two-step approach to LLM Fine-Tuning. An… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-insurance-llm-chatbot-training-dataset.
f
Development, system design, safety, and performance metrics of a...
figshare.com
xlsx
Updated Jul 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
David Villarreal-Zegarra (2025). Development, system design, safety, and performance metrics of a conversational agent for reducing depressive and anxious symptoms: The MHAI Study [Dataset]. http://doi.org/10.6084/m9.figshare.29606618.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.29606618.v1
Dataset updated
Jul 21, 2025
Dataset provided by
figshare
Authors
David Villarreal-Zegarra
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: Conversational agents based on large language models (LLMs) have shown moderate efficacy in reducing depressive and anxiety symptoms. However, most existing evaluations lack methodological transparency, rely on closed-source models, and show limited standardization in performance and safety assessment.Objective: We have two study objectives: (1) to develop an LLM-based conversational agent through system design analysis and initial functionality testing, and (2) to evaluate its safety and performance through standardized assessment in controlled simulated interactions focused on depression and anxiety of two LLMs (GPT-4o and Llama 3.1-8B).Methods: We conducted a cross-sectional study in two phases. First, we developed a mental health platform integrating a conversational agent with functionalities including personalized context, pretrained therapeutic modules, self-assessment tools, and an emergency alert system. Second, we evaluated the agent’s responses in simulated interactions based on predefined user personas for each LLM. Four expert raters assessed 816 interaction pairs using a 5-criterion Likert scale evaluating tone, clarity, domain accuracy (correctness), robustness, completeness, boundaries, target language, and safety. In addition, we use quantitative performance metrics such as cost, response length, and number of tokens. Multiple linear regression models were used to compare LLM performance and assess metric interrelations.Results: First, we developed a web-based mental health platform using a user-centered design, structured into frontend, backend, and database layers. The system integrates therapeutic chat (GPT-4o and Llama 3.1-8B), psychological assessments (PHQ-9, GAD-7), CBT-based tasks, and an emergency alert system. The platform supports secure user authentication, data encryption, multilingual access, and session tracking. Second, GPT-4o outperformed Llama 3.1-8B in both quantitative and qualitative metrics, generating longer and more lexically diverse responses, using more tokens, and scoring higher in clarity, robustness, completeness, boundaries, and target language. However, it incurred higher costs, with no significant differences in tone, accuracy, or safety.Conclusion: Our study presents a conversational agent with multiple functionalities and shows that GPT-4o outperforms Llama 3.1-8B in performance, although at a higher cost. This platform could be used in future clinical trials or real-world implementation studies.
WildChat
huggingface.co
Updated Jul 23, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ai2 (2024). WildChat [Dataset]. https://huggingface.co/datasets/allenai/WildChat
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 23, 2024
Dataset provided by
Allen Institute for AIhttp://allenai.org/
Authors
Ai2
License
https://choosealicense.com/licenses/odc-by/https://choosealicense.com/licenses/odc-by/
Description
Dataset Card for WildChat

Note: a newer version with 4.8 million conversations and demographic information can be found here. Dataset Description

Paper: https://arxiv.org/abs/2405.01470

Interactive Search Tool: https://wildvisualizer.com (paper)

License: ODC-BY

Language(s) (NLP): multi-lingual

Point of Contact: Yuntian Deng

Dataset Summary

WildChat is a collection of 650K conversations between human users and ChatGPT. We collected WildChat… See the full description on the dataset page: https://huggingface.co/datasets/allenai/WildChat.
h
awesome-chatgpt-prompts
huggingface.co
Updated Dec 15, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Fatih Kadir Akın (2023). awesome-chatgpt-prompts [Dataset]. https://huggingface.co/datasets/fka/awesome-chatgpt-prompts
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 15, 2023
Authors
Fatih Kadir Akın
License
https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
Description
🧠 Awesome ChatGPT Prompts [CSV dataset]

This is a Dataset Repository of Awesome ChatGPT Prompts View All Prompts on GitHub

License

CC-0
MSC-Self-Instruct
huggingface.co
Updated Oct 17, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
MemGPT (2023). MSC-Self-Instruct [Dataset]. https://huggingface.co/datasets/MemGPT/MSC-Self-Instruct
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 17, 2023
Dataset authored and provided by
MemGPT
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
MemGPT

This is the self-instruct dataset of MSC conversations used for MemGPT paper. For more information please refer to memgpt.ai The MSC dataset is a multi-round human conversations. In this dataset, our goal is to come up with a conversation opener, that is personalized to the user by referencing topics from the previous conversations. These were generated while evaluating MemGPT.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

The Devastator (2023). GPT Roleplay Realm: Enhanced Character [Dataset]. https://www.kaggle.com/datasets/thedevastator/gpt-roleplay-realm-enhanced-character-role-playi

GPT Roleplay Realm: Enhanced Character

Character Cards and Dialogues for immersive role-playing experiences

Explore at:

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Nov 30, 2023

Dataset provided by

Kaggle

Authors

The Devastator

License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

GPT Roleplay Realm: Enhanced Character Role-playing Dataset

Character Cards and Dialogues for immersive role-playing experiences

By Ilya Gusev (From Huggingface) [source]

About this dataset

The GPT Roleplay Realm dataset is a valuable resource for enhancing the capabilities of language models in the realm of character role-playing. Specifically designed to facilitate immersive role-playing experiences, this dataset is comprised of character cards generated by GPT models. These character cards contain essential information such as names, greetings, example dialogues, context, topics of interest, dialogues involving the characters, and image prompts.

With a focus on enriching language models' ability to engage in dynamic and realistic interactions with fictional characters, this dataset provides users with a diverse range of well-rounded characters to incorporate into their role-playing scenarios. Each character card includes a name that gives them an individual identity and distinction within the narrative.

Additionally, context descriptions offer crucial background information about each character's history or personality traits that can lend depth and authenticity to their portrayal. Greetings act as introductory statements that set the tone for interactions with these virtual personas.

Example dialogues showcase how these characters might converse within specific scenarios or settings. These conversations serve as guidelines for users when constructing interactive narratives or engaging in linguistic exchanges with these language model-generated characters.

Moreover, topics provided on each character card indicate the areas of expertise or interests that are inherent to each persona within the realm created by GPT models. This information enables users to generate dialogue that aligns with each character's unique knowledge base or passions.

Furthermore, dialogues involving additional participants allow for multi-person exchanges and enable more intricate storytelling possibilities within virtual worlds. This feature enhances user engagement by promoting collaborative storytelling among multiple AI-generated characters.

To enhance visual immersion and aid user creativity during role-playing experiences, image prompts are also included on each character card. These suggestive visuals stimulate users' imagination regarding how each character may appear physically based on their described features or characteristics.

In conclusion, by providing extensive details about fictional personas generated by language models via sample dialogues along with their relevant context descriptions, interests/topics listicles paired up provocative visual prompts, the GPT Roleplay Realm dataset elevates the standards of language models in creating immersive and engaging role-playing experiences

How to use the dataset

How to Use This Dataset: GPT Roleplay Realm

Welcome to the GPT Roleplay Realm dataset! This guide will help you navigate and make the best use of this enhanced character role-playing dataset.

Overview

The GPT Roleplay Realm dataset consists of character cards generated by GPT models. These character cards contain names, greetings, example dialogues, context, topics of interest, dialogues involving the characters, and image prompts. The purpose of this dataset is to provide language models with rich information about fictional characters that can be used for immersive role-playing experiences.

Understanding the Columns

The dataset is primarily organized into several columns:

name: The name of the character.

context: A brief description or background information about the character.

greeting: The initial greeting or introduction phrase of each character.

example_dialogue: A sample dialogue or conversation involving each character.

topics: The topics or themes that each character is knowledgeable or interested in.

dialogues: Additional dialogues or conversations involving each character.

image_prompt: Prompts or descriptions for images that represent each character.

Getting Started

When exploring this dataset, it may be helpful to first get a sense of all the available characters by examining their names using the name column.

You can then dive deeper into a specific character's information by exploring their context in order to understand their background and story.

To engage with a specific character in a role-playing scenario, start by using their provided greeting as an introductory statement towards them.

If you want to understand how different characters interact with ...

Clear search

Close search

Google apps

Main menu

GPT Roleplay Realm: Enhanced Character

GPT Roleplay Realm: Enhanced Character Role-playing Dataset

Character Cards and Dialogues for immersive role-playing experiences

About this dataset

How to use the dataset

How to Use This Dataset: GPT Roleplay Realm

Overview

Understanding the Columns

Getting Started

ChatGPT-RealUser-2.2M-preview

Bitext-insurance-llm-chatbot-training-dataset

Development, system design, safety, and performance metrics of a...

WildChat

awesome-chatgpt-prompts

MSC-Self-Instruct

GPT Roleplay Realm: Enhanced Character

Character Cards and Dialogues for immersive role-playing experiences

GPT Roleplay Realm: Enhanced Character Role-playing Dataset

Character Cards and Dialogues for immersive role-playing experiences

About this dataset

How to use the dataset

How to Use This Dataset: GPT Roleplay Realm

Overview

Understanding the Columns

Getting Started