100+ datasets found

h
ai-medical-chatbot
huggingface.co
Updated Feb 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ruslan Magana Vsevolodovna (2024). ai-medical-chatbot [Dataset]. https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 16, 2024
Authors
Ruslan Magana Vsevolodovna
Description
AI Medical Chatbot Dataset

This is an experimental Dataset designed to run a Medical Chatbot It contains at least 250k dialogues between a Patient and a Doctor.

Playground ChatBot

ruslanmv/AI-Medical-Chatbot For furter information visit the project here: https://github.com/ruslanmv/ai-medical-chatbot
h
chatbot_arena_conversations
huggingface.co
Updated Jul 18, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Large Model Systems Organization (2023). chatbot_arena_conversations [Dataset]. https://huggingface.co/datasets/lmsys/chatbot_arena_conversations
Explore at:
Dataset updated
Jul 18, 2023
Dataset authored and provided by
Large Model Systems Organization
License
https://choosealicense.com/licenses/cc/https://choosealicense.com/licenses/cc/
Description
Chatbot Arena Conversations Dataset

This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp. To ensure the safe release… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/chatbot_arena_conversations.
Mental Health Conversational Data
kaggle.com
Updated Oct 31, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
elvis (2022). Mental Health Conversational Data [Dataset]. https://www.kaggle.com/datasets/elvis23/mental-health-conversational-data
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 31, 2022
Dataset provided by
Kagglehttp://kaggle.com/
Authors
elvis
Description
A dataset containing basic conversations, mental health FAQ, classical therapy conversations, and general advice provided to people suffering from anxiety and depression.

This dataset can be used to train a model for a chatbot that can behave like a therapist in order to provide emotional support to people with anxiety & depression.

The dataset contains intents. An “intent” is the intention behind a user's message. For instance, If I were to say “I am sad” to the chatbot, the intent, in this case, would be “sad”. Depending upon the intent, there is a set of Patterns and Responses appropriate for the intent. Patterns are some examples of a user’s message which aligns with the intent while Responses are the replies that the chatbot provides in accordance with the intent. Various intents are defined and their patterns and responses are used as the model’s training data to identify a particular intent.
h
Bitext-travel-llm-chatbot-training-dataset
huggingface.co
Updated Jun 21, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bitext (2025). Bitext-travel-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-travel-llm-chatbot-training-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 21, 2025
Dataset authored and provided by
Bitext
License
https://choosealicense.com/licenses/cdla-sharing-1.0/https://choosealicense.com/licenses/cdla-sharing-1.0/
Description
Bitext - Travel Tagged Training Dataset for LLM-based Virtual Assistants

Overview

This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the [Travel] sector can be easily achieved using our two-step approach to LLM Fine-Tuning. An overview of… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-travel-llm-chatbot-training-dataset.
F
Hindi Conversation Chat Dataset for Real Estate Domain
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). Hindi Conversation Chat Dataset for Real Estate Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/hindi-realestate-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 12,000 chat conversations, each focusing on specific Real Estate related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 200+ native Hindi participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on Real Estate topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Real Estate use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Chats:
•Property Inquiry
•Rental Property Search & Availability
•Renovation Inquiries
•Property Features & Amenities Inquiry
•Investment Property Analysis & Advice
•Property History & Ownership Details, and many more
•Outbound Chats:
•New Property Listing Update
•Post Purchase Follow-ups
•Investment Opportunities & Property Recommendations
•Property Value Updates
•Customer Satisfaction Surveys, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in Hindi Real Estate interactions. This diversity ensures the dataset accurately represents the language used by Hindi speakers in Real Estate contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of Hindi personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Hindi-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Hindi forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Hindi Real Estate conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Hindi Real Estate interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Real Estate customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
•Solution Delivery
•Closing and Follow-ups
<span
F
Gujarati Conversation Chat Dataset for Travel Domain
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). Gujarati Conversation Chat Dataset for Travel Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/gujarati-travel-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 12,000 chat conversations, each focusing on specific Travel related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 200+ native Gujarati participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on Travel topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Travel use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Calls:
•Booking Inquiries & Assistance
•Destination Information & Recommendations
• Flight Delays or Cancellation Assistance
•Assistance for Disable Passengers
•Travel-related Health & Safety Inquiry
•Lost or Delayed Baggage Assistance, and many more
•Outbound Calls:
•Promotional Offers & Package Deals
•Customer Satisfaction Surveys
•Booking Confirmations & Updates
•Flight Schedule Changes & Notifications
•Customer Feedback Collection
•Visa Expiration Reminders, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in Gujarati Travel interactions. This diversity ensures the dataset accurately represents the language used by Gujarati speakers in Travel contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of Gujarati personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Gujarati-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Gujarati forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Gujarati Travel conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Gujarati Travel interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Travel customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
•Solution Delivery
<span
F
Bahasa Conversation Chat Dataset for Telecom Domain
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). Bahasa Conversation Chat Dataset for Telecom Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/bahasa-telecom-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 10,000 chat conversations, each focusing on specific Telecom related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 150+ native Bahasa participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on Telecom topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Telecom use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Chats:
•Phone Number Porting
•Network Connectivity Issues
•Billing and Payments
•Technical Support
•Service Activation
•International Roaming Enquiry
•Refunds and Billing Adjustments
•Emergency Service Access, and many more
•Outbound Chats:
•Welcome Calls / Onboarding Process
•Payment Reminders
•Customer Surveys
•Technical Updates
•Service Usage Reviews
•Network Complaint Update, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in Bahasa Telecom interactions. This diversity ensures the dataset accurately represents the language used by Bahasa speakers in Telecom contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of Bahasa personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Bahasa-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Bahasa forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Bahasa Telecom conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Bahasa Telecom interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Telecom customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
<span
French Conversations (from movie subtitles)
kaggle.com
Updated Aug 3, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dali Selmi (2023). French Conversations (from movie subtitles) [Dataset]. https://www.kaggle.com/datasets/daliselmi/french-conversational-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 3, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Dali Selmi
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Area covered
French
Description
French Movie Subtitle Conversations Dataset

Description

Dive into the world of French dialogue with the French Movie Subtitle Conversations dataset – a comprehensive collection of over 127,000 movie subtitle conversations. This dataset offers a deep exploration of authentic and diverse conversational contexts spanning various genres, eras, and scenarios. It is thoughtfully organized into three distinct sets: training, testing, and validation.

Content Overview

Each conversation in this dataset is structured as a JSON object, featuring three key attributes:

Context: Get a holistic view of the conversation's flow with the preceding 9 lines of dialogue. This context provides invaluable insights into the conversation's dynamics and contextual cues.

Knowledge: Immerse yourself in a wide range of thematic knowledge. This dataset covers an array of topics, ensuring that your models receive exposure to diverse information sources for generating well-informed responses.

Response: Explore how characters react and respond across various scenarios. From casual conversations to intense emotional exchanges, this dataset encapsulates the authenticity of genuine human interaction.

Data Sample

Here's a snippet from the dataset to give you an idea of its structure:

[ { "context": [ "Tu as attendu longtemps?", "Oui en effet.", "Je pense que c' est grossier pour un premier rencard.", // ... (6 more lines of context) ], "knowledge": "", "response": "On n' avait pas dit 9h?" }, // ... (more data samples) ]

Use Cases

The French Movie Subtitle Conversations dataset serves as a valuable resource for several applications:

Conversational AI: Train advanced chatbots and dialogue systems in French that can engage users in fluid, contextually aware conversations.

Language Modeling: Enhance your language models by leveraging diverse dialogue patterns, colloquialisms, and contextual dependencies present in real-world conversations.

Sentiment Analysis: Investigate the emotional tones of conversations across different movie genres and periods, contributing to a better understanding of sentiment variation.

Why This Dataset

Size and Diversity: With a vast collection of over 127,000 conversations spanning diverse genres and tones, this dataset offers an unparalleled breadth and depth in French dialogue data.

Contextual Richness: The inclusion of context empowers researchers and practitioners to explore the dynamics of conversation flow, leading to more accurate and contextually relevant responses.

Real-world Relevance: Originating from movie subtitles, this dataset mirrors real-world interactions, making it a valuable asset for training models that understand and generate human-like dialogue.

Acknowledgments

We extend our gratitude to the movie subtitle community for their contributions, which have enabled the creation of this diverse and comprehensive French dialogue dataset.

Unlock the potential of authentic French conversations today with the French Movie Subtitle Conversations dataset. Engage in state-of-the-art research, enhance language models, and create applications that resonate with the nuances of real dialogue.
h
mental_health_chatbot_dataset
huggingface.co
opendatalab.com
Updated Jul 21, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Arun Brahma (2023). mental_health_chatbot_dataset [Dataset]. https://huggingface.co/datasets/heliosbrahma/mental_health_chatbot_dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 21, 2023
Authors
Arun Brahma
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for "heliosbrahma/mental_health_chatbot_dataset"

Dataset Description Dataset Summary

This dataset contains conversational pair of questions and answers in a single text related to Mental Health. Dataset was curated from popular healthcare blogs like WebMD, Mayo Clinic and HeatlhLine, online FAQs etc. All questions and answers have been anonymized to remove any PII data and pre-processed to remove any unwanted characters.

Languages

The… See the full description on the dataset page: https://huggingface.co/datasets/heliosbrahma/mental_health_chatbot_dataset.
k
ExpBot - A dataset of 79 dialogs with an experimental customer service...
radar.kit.edu
radar-service.eu
tar
Updated Jun 21, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Mädche; Jasper Feine; Stefan Morana; Ulrich Gnewuch (2023). ExpBot - A dataset of 79 dialogs with an experimental customer service chatbot [Dataset]. http://doi.org/10.35097/1210
Explore at:
tar(251904 bytes)Available download formats
Unique identifier
https://doi.org/10.35097/1210
Dataset updated
Jun 21, 2023
Dataset provided by
Karlsruhe Institute of Technology
Gnewuch, Ulrich
Mädche, Alexander
Feine, Jasper
Morana, Stefan
Authors
Alexander Mädche; Jasper Feine; Stefan Morana; Ulrich Gnewuch
Description
This dataset consists of 79 dialogs between a human user and a chatbot in English language. This data was collected during an online experiment conducted by the research group "Information Systems & Service Design" at the Karlsruhe Institute of Technology (KIT). Experimental task: Participants were asked to interact with a chatbot to find out whether they could save money by switching to a better mobile phone plan. Additionally, there were shown a fictitious copy of last month's mobile phone bill. During the conversation, the chatbot asked about the participant's usage patterns (e.g., how much data was used) and recommended a randomly generated plan that better met the participant’s requirements. For more information, see Gnewuch et al. (2018). If you have any questions, please contact us via email (info@chatbotresearch.com) or visit https://chatbotresearch.com. WARNING! Some dialogs contain profanity and/or offensive language. Profanity was not removed because it is important for calculating sentiment scores. PUBLICATIONS / REFERENCES Gnewuch, U., Morana, S., Adam, M. T. P., and Maedche, A. 2018. “Faster Is Not Always Better: Understanding the Effect of Dynamic Response Delays in Human-Chatbot Interaction,” in Proceedings of the 26th European Conference on Information Systems (ECIS 2018), Portsmouth, United Kingdom. Feine, J., Morana, S., and Gnewuch, U. 2019. “Measuring Service Encounter Satisfaction with Customer Service Chatbots using Sentiment Analysis,” in Proceedings of the 14th International Conference on Wirtschaftsinformatik (WI 2019), Siegen, Germany, February 24–27.
F
English Conversation Chat Dataset for Healthcare Domain
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). English Conversation Chat Dataset for Healthcare Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/english-healthcare-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 12,000 chat conversations, each focusing on specific Healthcare related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 200+ native English participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on Healthcare topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Healthcare use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Chats:
•Appointment Scheduling
•New Patient Registration
•Surgery Consultation
•Consultation regarding Diet, and many more
•Outbound Chats:
•Appointment Reminder
•Health & Wellness Subscription Programs
•Lab Test Results
•Health Risk Assessments
•Preventive Care Reminders, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in English Healthcare interactions. This diversity ensures the dataset accurately represents the language used by English speakers in Healthcare contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of English personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different English-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in English forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in English Healthcare conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to English Healthcare interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Healthcare customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
•Solution Delivery
•Closing and Follow-ups
•Feedback, etc
This structured and varied conversational flow enables the creation of advanced NLP models that can effectively manage and respond to a wide range of customer service scenarios.
Data Format and Structure
The dataset is available in JSON, CSV, and TXT formats, with each conversation containing attributes like participant identifiers and chat
A
‘Mental Health FAQ for Chatbot’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Mental Health FAQ for Chatbot’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-mental-health-faq-for-chatbot-d58c/latest
Explore at:
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Mental Health FAQ for Chatbot’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/narendrageek/mental-health-faq-for-chatbot on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Content

Mental health includes our emotional, psychological, and social well-being. Mental health is integral to living a healthy, balanced life. It affects how we think, feel, and act. It also helps determine how we handle stress, relate to others, and make choices. Emotional and mental health is important because it’s a vital part of your life and impacts your thoughts, behaviors and emotions. Being healthy emotionally can promote productivity and effectiveness in activities like work, school or care-giving. It plays an important part in the health of your relationships, and allows you to adapt to changes in your life and cope with adversity. Mental health problems are common but help is available. People with mental health problems can get better and many recover completely.

This dataset consists of FAQs about Mental Health.

Acknowledgements

https://www.thekimfoundation.org/faqs/

https://www.mhanational.org/frequently-asked-questions

https://www.wellnessinmind.org/frequently-asked-questions/

https://www.heretohelp.bc.ca/questions-and-answers

--- Original source retains full ownership of the source dataset ---
F
German Conversation Chat Dataset for Healthcare Domain
futurebeeai.com
wav
Updated Aug 1, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). German Conversation Chat Dataset for Healthcare Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/german-healthcare-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 12,000 chat conversations, each focusing on specific Healthcare related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 200+ native German participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on Healthcare topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Healthcare use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Chats:
•Appointment Scheduling
•New Patient Registration
•Surgery Consultation
•Consultation regarding Diet, and many more
•Outbound Chats:
•Appointment Reminder
•Health & Wellness Subscription Programs
•Lab Test Results
•Health Risk Assessments
•Preventive Care Reminders, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in German Healthcare interactions. This diversity ensures the dataset accurately represents the language used by German speakers in Healthcare contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of German personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different German-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in German forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in German Healthcare conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to German Healthcare interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Healthcare customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
•Solution Delivery
•Closing and Follow-ups
•Feedback, etc
This structured and varied conversational flow enables the creation of advanced NLP models that can effectively manage and respond to a wide range of customer service scenarios.
Data Format and Structure
The dataset is available in JSON, CSV, and TXT formats, with each conversation containing attributes like participant identifiers and chat messages,
Customer service chatbots use in Germany 2023
statista.com
Updated Jun 23, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Statista (2025). Customer service chatbots use in Germany 2023 [Dataset]. https://www.statista.com/statistics/1395418/customer-service-chatbot-germany/
Explore at:
Dataset updated
Jun 23, 2025
Dataset authored and provided by
Statistahttp://statista.com/
Time period covered
May 2023
Area covered
Germany
Description
In 2023, roughly ** percent of people in Germany said they would find a customer service chatbot useful for flights and hotels, as well as utility services. ** percent of people were not interested in the help of a chatbot. The rise of chatbots ChatGPT was launched in November 2022, and although chatbots existed prior, it was the first one that allowed users to dictate the length, and style of, as well as direct a conversation. Since this AI technology is so versatile, there are many different purposes for which it can be used. For example, some people use the software to help them understand complex theories they are learning for their studies, whilst others ask the chatbot to plan their meals for the week. Almost ** percent of ChatGPT users were aged 18 to 34 in 2023, whilst only **** percent were over the age of 55. When it comes to creating chatbots companies are facing challenges since the technology is new and highly complex. For most companies, the biggest difficulty is data management. This is due to the fact that so much data is required to train AI programs and when they are used, there is also a huge amount of data generated. Commercial usage of chatbots One industry that has been using chatbots for the past couple of years is the online shopping industry. The most popular function of chatbots among online shoppers globally was searching for product information. This was also the top result for consumers in Germany, followed by customer service and sending of updates about products. However, Germany did have a **************** of chatbots than the global average. Similarly, when it came to the share of those shopping online who considered chatbot customer service useful, Germany also ranked quite low, with only ** percent of respondents stating that they found it useful. Other countries such as India, UAE, and Indonesia had a *********** uptake rate.
F
Swedish Conversation Chat Dataset for BFSI Domain
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). Swedish Conversation Chat Dataset for BFSI Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/swedish-bfsi-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 10,000 chat conversations, each focusing on specific BFSI-related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 150+ native Swedish participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on BFSI topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various BFSI use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Chats:
•Account Opening
•Account Management
•Transactions
•Loan Inquiries & Applications
•Credit Card Services, and many more
•Outbound Chats:
•Product & Service Promotions
•Cross-selling & Upselling
•Customer Retention & Loyalty Programs
•Loan Application Follow-ups
•Insurance Policy Renewals/Reminders, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in Swedish BFSI interactions. This diversity ensures the dataset accurately represents the language used by Swedish speakers in BFSI contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of Swedish personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Swedish-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Swedish forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Swedish BFSI conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Swedish BFSI interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of BFSI customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
•Solution Delivery
•Closing and Follow-ups
•Feedback, etc
This structured and varied conversational flow enables the creation of advanced NLP models that can effectively manage and respond to a wide range of customer service scenarios.
Data Format and Structure
<p
F
Hindi Conversation Chat Dataset for Retail & E-commerce Domain
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). Hindi Conversation Chat Dataset for Retail & E-commerce Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/hindi-retail-domain-conversation-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The dataset comprises over 12,000 chat conversations, each focusing on specific Retail & E-Commerce related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.
•
Participants Details: 200+ native Hindi participants from the FutureBeeAI community.

•
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

Topic Diversity
The chat dataset covers a wide range of conversations on Retail & E-Commerce topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Retail & E-Commerce use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.
•Inbound Chats:
•Product Inquiry
•Return/Exchange Request
•Order Cancellation
•Refund Request
•Membership/Subscriptions Enquiry
•Order Cancellations, and many more
•Outbound Chats:
•Order Confirmation
•Cross-selling and Upselling
•Account Updates
•Loyalty Program Offers
•Special Offers and Promotions
•Customer Verification, and many more
Language Variety & Nuances
The conversations in this dataset capture the diverse language styles and expressions prevalent in Hindi Retail & E-Commerce interactions. This diversity ensures the dataset accurately represents the language used by Hindi speakers in Retail & E-Commerce contexts.
The dataset encompasses a wide array of language elements, including:
•
Naming Conventions: Chats include a variety of Hindi personal and business names.

•
Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Hindi-speaking regions.

•
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Hindi forms, adhering to local conventions.

•
Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Hindi Retail & E-Commerce conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Hindi Retail & E-Commerce interactions.
Conversational Flow and Interaction Types
The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Retail & E-Commerce customer-agent interactions.
•Simple Inquiries
•Detailed Discussions
•Transactional Interactions
•Problem-Solving Dialogues
•Advisory Sessions
•Routine Checks and Follow-Ups
Each of these conversations contains various aspects of conversation flow like:
•Greetings
•Authentication
•Information gathering
•Resolution identification
•Solution Delivery
•Closing and Follow-ups
<div style="margin-top:10px;
f
Data_Sheet_1_Development and testing of a multi-lingual Natural Language...
frontiersin.figshare.com
docx
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Lily Wei Yun Yang; Wei Yan Ng; Xiaofeng Lei; Shaun Chern Yuan Tan; Zhaoran Wang; Ming Yan; Mohan Kashyap Pargi; Xiaoman Zhang; Jane Sujuan Lim; Dinesh Visva Gunasekeran; Franklin Chee Ping Tan; Chen Ee Lee; Khung Keong Yeo; Hiang Khoon Tan; Henry Sun Sien Ho; Benedict Wee Bor Tan; Tien Yin Wong; Kenneth Yung Chiang Kwek; Rick Siow Mong Goh; Yong Liu; Daniel Shu Wei Ting (2023). Data_Sheet_1_Development and testing of a multi-lingual Natural Language Processing-based deep learning system in 10 languages for COVID-19 pandemic crisis: A multi-center study.docx [Dataset]. http://doi.org/10.3389/fpubh.2023.1063466.s001
Explore at:
docxAvailable download formats
Unique identifier
https://doi.org/10.3389/fpubh.2023.1063466.s001
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers
Authors
Lily Wei Yun Yang; Wei Yan Ng; Xiaofeng Lei; Shaun Chern Yuan Tan; Zhaoran Wang; Ming Yan; Mohan Kashyap Pargi; Xiaoman Zhang; Jane Sujuan Lim; Dinesh Visva Gunasekeran; Franklin Chee Ping Tan; Chen Ee Lee; Khung Keong Yeo; Hiang Khoon Tan; Henry Sun Sien Ho; Benedict Wee Bor Tan; Tien Yin Wong; Kenneth Yung Chiang Kwek; Rick Siow Mong Goh; Yong Liu; Daniel Shu Wei Ting
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PurposeThe COVID-19 pandemic has drastically disrupted global healthcare systems. With the higher demand for healthcare and misinformation related to COVID-19, there is a need to explore alternative models to improve communication. Artificial Intelligence (AI) and Natural Language Processing (NLP) have emerged as promising solutions to improve healthcare delivery. Chatbots could fill a pivotal role in the dissemination and easy accessibility of accurate information in a pandemic. In this study, we developed a multi-lingual NLP-based AI chatbot, DR-COVID, which responds accurately to open-ended, COVID-19 related questions. This was used to facilitate pandemic education and healthcare delivery.MethodsFirst, we developed DR-COVID with an ensemble NLP model on the Telegram platform (https://t.me/drcovid_nlp_chatbot). Second, we evaluated various performance metrics. Third, we evaluated multi-lingual text-to-text translation to Chinese, Malay, Tamil, Filipino, Thai, Japanese, French, Spanish, and Portuguese. We utilized 2,728 training questions and 821 test questions in English. Primary outcome measurements were (A) overall and top 3 accuracies; (B) Area Under the Curve (AUC), precision, recall, and F1 score. Overall accuracy referred to a correct response for the top answer, whereas top 3 accuracy referred to an appropriate response for any one answer amongst the top 3 answers. AUC and its relevant matrices were obtained from the Receiver Operation Characteristics (ROC) curve. Secondary outcomes were (A) multi-lingual accuracy; (B) comparison to enterprise-grade chatbot systems. The sharing of training and testing datasets on an open-source platform will also contribute to existing data.ResultsOur NLP model, utilizing the ensemble architecture, achieved overall and top 3 accuracies of 0.838 [95% confidence interval (CI): 0.826–0.851] and 0.922 [95% CI: 0.913–0.932] respectively. For overall and top 3 results, AUC scores of 0.917 [95% CI: 0.911–0.925] and 0.960 [95% CI: 0.955–0.964] were achieved respectively. We achieved multi-linguicism with nine non-English languages, with Portuguese performing the best overall at 0.900. Lastly, DR-COVID generated answers more accurately and quickly than other chatbots, within 1.12–2.15 s across three devices tested.ConclusionDR-COVID is a clinically effective NLP-based conversational AI chatbot, and a promising solution for healthcare delivery in the pandemic era.
h
Bitext-retail-ecommerce-llm-chatbot-training-dataset
huggingface.co
Updated Aug 6, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bitext (2024). Bitext-retail-ecommerce-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-retail-ecommerce-llm-chatbot-training-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 6, 2024
Dataset authored and provided by
Bitext
License
https://choosealicense.com/licenses/cdla-sharing-1.0/https://choosealicense.com/licenses/cdla-sharing-1.0/
Description
Bitext - Retail (eCommerce) Tagged Training Dataset for LLM-based Virtual Assistants

Overview

This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the [Retail (eCommerce)] sector can be easily achieved using our two-step approach to LLM… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-retail-ecommerce-llm-chatbot-training-dataset.
LLM Influence on Medical Diagnostic Reasoning
kaggle.com
Updated Dec 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Patrick L Ford (2024). LLM Influence on Medical Diagnostic Reasoning [Dataset]. http://doi.org/10.34740/kaggle/dsv/10119916
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10119916
Dataset updated
Dec 6, 2024
Dataset provided by
Kaggle
Authors
Patrick L Ford
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Introduction

A new study published in JAMA Network Open revealed that ChatGPT-4 outperformed doctors in diagnosing medical conditions from case reports. The AI chatbot scored an average of 92% in the study, while doctors using the chatbot scored 76% and those without it scored 74%.

The study involved 50 doctors (26 attending, 24 residents; median years in practice, 3 [IQR, 2-8]) who were given six case histories and graded on their ability to suggest diagnoses and explain their reasoning. The results showed that doctors often stuck to their initial diagnoses even when the chatbot suggested a better one, highlighting an overconfidence bias. Additionally, many doctors didn't fully utilise the chatbot's capabilities, treating it like a search engine instead of leveraging its ability to analyse full case histories.

The study raises questions about how doctors think and how AI tools can be best integrated into medical practice. While AI has the potential to be a "doctor extender," providing valuable second opinions, the study suggests that more training and a shift in mindset may be needed for doctors to fully embrace and benefit from these advancements. link

Study Findings

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F4e4c6a4ce9f191ab32e660c726c5204f%2FScreenshot%202024-12-05%2013.33.30.png?generation=1733490846716451&alt=media" alt="">

Visualisation

The study compares the diagnostic reasoning performance of physicians using a commercial LLM AI chatbot (ChatGPT Plus [GPT-4]: OpenAl) compared with conventional diagnostic resources (eg, UpToDate, Google): - ***Conventional Resources*-Only Group (Doctor on Own):** This group refers to doctors using only conventional resources (likely standard medical tools and knowledge) without the assistance of an LLM (large language model). - Doctor With LLM Group: This group involves doctors using conventional resources along with an LLM, which could be a tool or AI assistant helping with diagnostic reasoning. - ***LLM Alone* Group:** This group refers to the use of the LLM on its own, without any conventional resources or doctor intervention.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F7360932a01d641b6adc3594b2e5cae11%2FScreenshot%202024-12-06%2012.11.05.png?generation=1733490890087478&alt=media" alt="">

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F7e14a7c648febf04ac657f8dc51ea796%2FScreenshot%202024-12-06%2012.11.58.png?generation=1733490908679868&alt=media" alt="">

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F9b9d165a7c69b1a5624186b7904c46c0%2FScreenshot%202024-12-06%2012.12.41.png?generation=1733490932343833&alt=media" alt="">

A Markdown document with the R code for the above plots. link

Conclusion

This study reveals a fascinating and potentially transformative dynamic between artificial intelligence and human medical expertise. While ChatGPT-4 demonstrated remarkable diagnostic accuracy, surpassing even experienced physicians, the study also highlighted critical challenges in integrating AI into clinical practice.

The findings suggest that: - AI can significantly enhance diagnostic accuracy: LLMs like ChatGPT-4 have the potential to revolutionise how medical diagnoses are made, offering a level of accuracy exceeding current practices. - Human factors remain crucial: Overconfidence bias and under-utilisation of AI tools by physicians underscore the need for training and a shift in mindset to effectively leverage these advancements. Doctors must learn to collaborate with AI, viewing it as a powerful partner rather than a simple search engine. - Further research is needed: This study provides a crucial starting point for further investigation into the optimal integration of AI into healthcare. Future research should explore: - Effective training methods for physicians to utilise AI tools. - The impact of AI assistance on patient outcomes. - Ethical considerations surrounding the use of AI in medicine. - The potential for AI to address healthcare disparities.

Ultimately, the successful integration of AI into healthcare will depend not only on technological advancements but also on a willingness among medical professionals to embrace new ways of thinking and working. By harnessing the power of AI while recognising the essential role of human expertise, we can strive towards a future where medical care is more accurate, efficient, and accessible for all.

Patrick Ford 🥼🩺🖥
h
lmsys-chat-1m
huggingface.co
opendatalab.com
Updated Sep 17, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Large Model Systems Organization (2023). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/lmsys/lmsys-chat-1m
Explore at:
Dataset updated
Sep 17, 2023
Dataset authored and provided by
Large Model Systems Organization
Description
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of use"… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/lmsys-chat-1m.

Facebook

Twitter

Click to copy link

Link copied

Cite

Ruslan Magana Vsevolodovna (2024). ai-medical-chatbot [Dataset]. https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot

ai-medical-chatbot

ruslanmv/ai-medical-chatbot

Explore at:

3 scholarly articles cite this dataset (View in Google Scholar)

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Feb 16, 2024

Authors

Ruslan Magana Vsevolodovna

Description

AI Medical Chatbot Dataset

This is an experimental Dataset designed to run a Medical Chatbot It contains at least 250k dialogues between a Patient and a Doctor.

  Playground ChatBot

ruslanmv/AI-Medical-Chatbot For furter information visit the project here: https://github.com/ruslanmv/ai-medical-chatbot

Clear search

Close search

Google apps

Main menu

ai-medical-chatbot

chatbot_arena_conversations

Mental Health Conversational Data

Bitext-travel-llm-chatbot-training-dataset

Hindi Conversation Chat Dataset for Real Estate Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

Gujarati Conversation Chat Dataset for Travel Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

Bahasa Conversation Chat Dataset for Telecom Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

French Conversations (from movie subtitles)

French Movie Subtitle Conversations Dataset

Description

Content Overview

Data Sample

Use Cases

Why This Dataset

Acknowledgments

mental_health_chatbot_dataset

ExpBot - A dataset of 79 dialogs with an experimental customer service...

English Conversation Chat Dataset for Healthcare Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

Data Format and Structure

‘Mental Health FAQ for Chatbot’ analyzed by Analyst-2

Content

Acknowledgements

German Conversation Chat Dataset for Healthcare Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

Data Format and Structure

Customer service chatbots use in Germany 2023

Swedish Conversation Chat Dataset for BFSI Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

Data Format and Structure

Hindi Conversation Chat Dataset for Retail & E-commerce Domain

Introduction

Topic Diversity

Language Variety & Nuances

Conversational Flow and Interaction Types

Data_Sheet_1_Development and testing of a multi-lingual Natural Language...

Bitext-retail-ecommerce-llm-chatbot-training-dataset

LLM Influence on Medical Diagnostic Reasoning

Introduction

Study Findings

Visualisation

Conclusion

lmsys-chat-1m

ai-medical-chatbot

ruslanmv/ai-medical-chatbot