100+ datasets found
  1. h

    ai-medical-chatbot

    • huggingface.co
    Updated Feb 16, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ruslan Magana Vsevolodovna (2024). ai-medical-chatbot [Dataset]. https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 16, 2024
    Authors
    Ruslan Magana Vsevolodovna
    Description

    AI Medical Chatbot Dataset

    This is an experimental Dataset designed to run a Medical Chatbot It contains at least 250k dialogues between a Patient and a Doctor.

      Playground ChatBot
    

    ruslanmv/AI-Medical-Chatbot For furter information visit the project here: https://github.com/ruslanmv/ai-medical-chatbot

  2. h

    chatbot_arena_conversations

    • huggingface.co
    Updated Jul 18, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Large Model Systems Organization (2023). chatbot_arena_conversations [Dataset]. https://huggingface.co/datasets/lmsys/chatbot_arena_conversations
    Explore at:
    Dataset updated
    Jul 18, 2023
    Dataset authored and provided by
    Large Model Systems Organization
    License

    https://choosealicense.com/licenses/cc/https://choosealicense.com/licenses/cc/

    Description

    Chatbot Arena Conversations Dataset

    This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp. To ensure the safe release… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/chatbot_arena_conversations.

  3. Mental Health Conversational Data

    • kaggle.com
    Updated Oct 31, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    elvis (2022). Mental Health Conversational Data [Dataset]. https://www.kaggle.com/datasets/elvis23/mental-health-conversational-data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 31, 2022
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    elvis
    Description

    A dataset containing basic conversations, mental health FAQ, classical therapy conversations, and general advice provided to people suffering from anxiety and depression.

    This dataset can be used to train a model for a chatbot that can behave like a therapist in order to provide emotional support to people with anxiety & depression.

    The dataset contains intents. An “intent” is the intention behind a user's message. For instance, If I were to say “I am sad” to the chatbot, the intent, in this case, would be “sad”. Depending upon the intent, there is a set of Patterns and Responses appropriate for the intent. Patterns are some examples of a user’s message which aligns with the intent while Responses are the replies that the chatbot provides in accordance with the intent. Various intents are defined and their patterns and responses are used as the model’s training data to identify a particular intent.

  4. h

    Bitext-travel-llm-chatbot-training-dataset

    • huggingface.co
    Updated Jun 21, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bitext (2025). Bitext-travel-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-travel-llm-chatbot-training-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 21, 2025
    Dataset authored and provided by
    Bitext
    License

    https://choosealicense.com/licenses/cdla-sharing-1.0/https://choosealicense.com/licenses/cdla-sharing-1.0/

    Description

    Bitext - Travel Tagged Training Dataset for LLM-based Virtual Assistants

      Overview
    

    This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the [Travel] sector can be easily achieved using our two-step approach to LLM Fine-Tuning. An overview of… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-travel-llm-chatbot-training-dataset.

  5. F

    Hindi Conversation Chat Dataset for Real Estate Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Hindi Conversation Chat Dataset for Real Estate Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/hindi-realestate-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 12,000 chat conversations, each focusing on specific Real Estate related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 200+ native Hindi participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on Real Estate topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Real Estate use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Chats:
    Property Inquiry
    Rental Property Search & Availability
    Renovation Inquiries
    Property Features & Amenities Inquiry
    Investment Property Analysis & Advice
    Property History & Ownership Details, and many more
    Outbound Chats:
    New Property Listing Update
    Post Purchase Follow-ups
    Investment Opportunities & Property Recommendations
    Property Value Updates
    Customer Satisfaction Surveys, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in Hindi Real Estate interactions. This diversity ensures the dataset accurately represents the language used by Hindi speakers in Real Estate contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of Hindi personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Hindi-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Hindi forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Hindi Real Estate conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Hindi Real Estate interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Real Estate customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    Solution Delivery
    Closing and Follow-ups
    <span

  6. F

    Gujarati Conversation Chat Dataset for Travel Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Gujarati Conversation Chat Dataset for Travel Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/gujarati-travel-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 12,000 chat conversations, each focusing on specific Travel related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 200+ native Gujarati participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on Travel topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Travel use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Calls:
    Booking Inquiries & Assistance
    Destination Information & Recommendations
    Flight Delays or Cancellation Assistance
    Assistance for Disable Passengers
    Travel-related Health & Safety Inquiry
    Lost or Delayed Baggage Assistance, and many more
    Outbound Calls:
    Promotional Offers & Package Deals
    Customer Satisfaction Surveys
    Booking Confirmations & Updates
    Flight Schedule Changes & Notifications
    Customer Feedback Collection
    Visa Expiration Reminders, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in Gujarati Travel interactions. This diversity ensures the dataset accurately represents the language used by Gujarati speakers in Travel contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of Gujarati personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Gujarati-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Gujarati forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Gujarati Travel conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Gujarati Travel interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Travel customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    Solution Delivery
    <span

  7. F

    Bahasa Conversation Chat Dataset for Telecom Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Bahasa Conversation Chat Dataset for Telecom Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/bahasa-telecom-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 10,000 chat conversations, each focusing on specific Telecom related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 150+ native Bahasa participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on Telecom topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Telecom use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Chats:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refunds and Billing Adjustments
    Emergency Service Access, and many more
    Outbound Chats:
    Welcome Calls / Onboarding Process
    Payment Reminders
    Customer Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Update, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in Bahasa Telecom interactions. This diversity ensures the dataset accurately represents the language used by Bahasa speakers in Telecom contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of Bahasa personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Bahasa-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Bahasa forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Bahasa Telecom conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Bahasa Telecom interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Telecom customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    <span

  8. French Conversations (from movie subtitles)

    • kaggle.com
    Updated Aug 3, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dali Selmi (2023). French Conversations (from movie subtitles) [Dataset]. https://www.kaggle.com/datasets/daliselmi/french-conversational-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 3, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Dali Selmi
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    French
    Description

    French Movie Subtitle Conversations Dataset

    Description

    Dive into the world of French dialogue with the French Movie Subtitle Conversations dataset – a comprehensive collection of over 127,000 movie subtitle conversations. This dataset offers a deep exploration of authentic and diverse conversational contexts spanning various genres, eras, and scenarios. It is thoughtfully organized into three distinct sets: training, testing, and validation.

    Content Overview

    Each conversation in this dataset is structured as a JSON object, featuring three key attributes:

    1. Context: Get a holistic view of the conversation's flow with the preceding 9 lines of dialogue. This context provides invaluable insights into the conversation's dynamics and contextual cues.
    2. Knowledge: Immerse yourself in a wide range of thematic knowledge. This dataset covers an array of topics, ensuring that your models receive exposure to diverse information sources for generating well-informed responses.
    3. Response: Explore how characters react and respond across various scenarios. From casual conversations to intense emotional exchanges, this dataset encapsulates the authenticity of genuine human interaction.

    Data Sample

    Here's a snippet from the dataset to give you an idea of its structure:

    [
     {
      "context": [
       "Tu as attendu longtemps?",
       "Oui en effet.",
       "Je pense que c' est grossier pour un premier rencard.",
       // ... (6 more lines of context)
      ],
      "knowledge": "",
      "response": "On n' avait pas dit 9h?"
     },
     // ... (more data samples)
    ]
    

    Use Cases

    The French Movie Subtitle Conversations dataset serves as a valuable resource for several applications:

    • Conversational AI: Train advanced chatbots and dialogue systems in French that can engage users in fluid, contextually aware conversations.
    • Language Modeling: Enhance your language models by leveraging diverse dialogue patterns, colloquialisms, and contextual dependencies present in real-world conversations.
    • Sentiment Analysis: Investigate the emotional tones of conversations across different movie genres and periods, contributing to a better understanding of sentiment variation.

    Why This Dataset

    • Size and Diversity: With a vast collection of over 127,000 conversations spanning diverse genres and tones, this dataset offers an unparalleled breadth and depth in French dialogue data.
    • Contextual Richness: The inclusion of context empowers researchers and practitioners to explore the dynamics of conversation flow, leading to more accurate and contextually relevant responses.
    • Real-world Relevance: Originating from movie subtitles, this dataset mirrors real-world interactions, making it a valuable asset for training models that understand and generate human-like dialogue.

    Acknowledgments

    We extend our gratitude to the movie subtitle community for their contributions, which have enabled the creation of this diverse and comprehensive French dialogue dataset.

    Unlock the potential of authentic French conversations today with the French Movie Subtitle Conversations dataset. Engage in state-of-the-art research, enhance language models, and create applications that resonate with the nuances of real dialogue.

  9. h

    mental_health_chatbot_dataset

    • huggingface.co
    • opendatalab.com
    Updated Jul 21, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Arun Brahma (2023). mental_health_chatbot_dataset [Dataset]. https://huggingface.co/datasets/heliosbrahma/mental_health_chatbot_dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 21, 2023
    Authors
    Arun Brahma
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "heliosbrahma/mental_health_chatbot_dataset"

      Dataset Description
    
    
    
    
    
      Dataset Summary
    

    This dataset contains conversational pair of questions and answers in a single text related to Mental Health. Dataset was curated from popular healthcare blogs like WebMD, Mayo Clinic and HeatlhLine, online FAQs etc. All questions and answers have been anonymized to remove any PII data and pre-processed to remove any unwanted characters.

      Languages
    

    The… See the full description on the dataset page: https://huggingface.co/datasets/heliosbrahma/mental_health_chatbot_dataset.

  10. k

    ExpBot - A dataset of 79 dialogs with an experimental customer service...

    • radar.kit.edu
    • radar-service.eu
    tar
    Updated Jun 21, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alexander Mädche; Jasper Feine; Stefan Morana; Ulrich Gnewuch (2023). ExpBot - A dataset of 79 dialogs with an experimental customer service chatbot [Dataset]. http://doi.org/10.35097/1210
    Explore at:
    tar(251904 bytes)Available download formats
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    Karlsruhe Institute of Technology
    Gnewuch, Ulrich
    Mädche, Alexander
    Feine, Jasper
    Morana, Stefan
    Authors
    Alexander Mädche; Jasper Feine; Stefan Morana; Ulrich Gnewuch
    Description

    This dataset consists of 79 dialogs between a human user and a chatbot in English language. This data was collected during an online experiment conducted by the research group "Information Systems & Service Design" at the Karlsruhe Institute of Technology (KIT). Experimental task: Participants were asked to interact with a chatbot to find out whether they could save money by switching to a better mobile phone plan. Additionally, there were shown a fictitious copy of last month's mobile phone bill. During the conversation, the chatbot asked about the participant's usage patterns (e.g., how much data was used) and recommended a randomly generated plan that better met the participant’s requirements. For more information, see Gnewuch et al. (2018). If you have any questions, please contact us via email (info@chatbotresearch.com) or visit https://chatbotresearch.com. WARNING! Some dialogs contain profanity and/or offensive language. Profanity was not removed because it is important for calculating sentiment scores. PUBLICATIONS / REFERENCES Gnewuch, U., Morana, S., Adam, M. T. P., and Maedche, A. 2018. “Faster Is Not Always Better: Understanding the Effect of Dynamic Response Delays in Human-Chatbot Interaction,” in Proceedings of the 26th European Conference on Information Systems (ECIS 2018), Portsmouth, United Kingdom. Feine, J., Morana, S., and Gnewuch, U. 2019. “Measuring Service Encounter Satisfaction with Customer Service Chatbots using Sentiment Analysis,” in Proceedings of the 14th International Conference on Wirtschaftsinformatik (WI 2019), Siegen, Germany, February 24–27.

  11. F

    English Conversation Chat Dataset for Healthcare Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). English Conversation Chat Dataset for Healthcare Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/english-healthcare-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 12,000 chat conversations, each focusing on specific Healthcare related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 200+ native English participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on Healthcare topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Healthcare use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Chats:
    Appointment Scheduling
    New Patient Registration
    Surgery Consultation
    Consultation regarding Diet, and many more
    Outbound Chats:
    Appointment Reminder
    Health & Wellness Subscription Programs
    Lab Test Results
    Health Risk Assessments
    Preventive Care Reminders, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in English Healthcare interactions. This diversity ensures the dataset accurately represents the language used by English speakers in Healthcare contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of English personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different English-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in English forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in English Healthcare conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to English Healthcare interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Healthcare customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    Solution Delivery
    Closing and Follow-ups
    Feedback, etc

    This structured and varied conversational flow enables the creation of advanced NLP models that can effectively manage and respond to a wide range of customer service scenarios.

    Data Format and Structure

    The dataset is available in JSON, CSV, and TXT formats, with each conversation containing attributes like participant identifiers and chat

  12. A

    ‘Mental Health FAQ for Chatbot’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Mental Health FAQ for Chatbot’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-mental-health-faq-for-chatbot-d58c/latest
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Mental Health FAQ for Chatbot’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/narendrageek/mental-health-faq-for-chatbot on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Content

    Mental health includes our emotional, psychological, and social well-being. Mental health is integral to living a healthy, balanced life. It affects how we think, feel, and act. It also helps determine how we handle stress, relate to others, and make choices. Emotional and mental health is important because it’s a vital part of your life and impacts your thoughts, behaviors and emotions. Being healthy emotionally can promote productivity and effectiveness in activities like work, school or care-giving. It plays an important part in the health of your relationships, and allows you to adapt to changes in your life and cope with adversity. Mental health problems are common but help is available. People with mental health problems can get better and many recover completely.

    This dataset consists of FAQs about Mental Health.

    Acknowledgements

    https://www.thekimfoundation.org/faqs/

    https://www.mhanational.org/frequently-asked-questions

    https://www.wellnessinmind.org/frequently-asked-questions/

    https://www.heretohelp.bc.ca/questions-and-answers

    --- Original source retains full ownership of the source dataset ---

  13. F

    German Conversation Chat Dataset for Healthcare Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). German Conversation Chat Dataset for Healthcare Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/german-healthcare-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 12,000 chat conversations, each focusing on specific Healthcare related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 200+ native German participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on Healthcare topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Healthcare use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Chats:
    Appointment Scheduling
    New Patient Registration
    Surgery Consultation
    Consultation regarding Diet, and many more
    Outbound Chats:
    Appointment Reminder
    Health & Wellness Subscription Programs
    Lab Test Results
    Health Risk Assessments
    Preventive Care Reminders, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in German Healthcare interactions. This diversity ensures the dataset accurately represents the language used by German speakers in Healthcare contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of German personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different German-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in German forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in German Healthcare conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to German Healthcare interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Healthcare customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    Solution Delivery
    Closing and Follow-ups
    Feedback, etc

    This structured and varied conversational flow enables the creation of advanced NLP models that can effectively manage and respond to a wide range of customer service scenarios.

    Data Format and Structure

    The dataset is available in JSON, CSV, and TXT formats, with each conversation containing attributes like participant identifiers and chat messages,

  14. Customer service chatbots use in Germany 2023

    • statista.com
    Updated Jun 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Customer service chatbots use in Germany 2023 [Dataset]. https://www.statista.com/statistics/1395418/customer-service-chatbot-germany/
    Explore at:
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    May 2023
    Area covered
    Germany
    Description

    In 2023, roughly ** percent of people in Germany said they would find a customer service chatbot useful for flights and hotels, as well as utility services. ** percent of people were not interested in the help of a chatbot. The rise of chatbots ChatGPT was launched in November 2022, and although chatbots existed prior, it was the first one that allowed users to dictate the length, and style of, as well as direct a conversation. Since this AI technology is so versatile, there are many different purposes for which it can be used. For example, some people use the software to help them understand complex theories they are learning for their studies, whilst others ask the chatbot to plan their meals for the week. Almost ** percent of ChatGPT users were aged 18 to 34 in 2023, whilst only **** percent were over the age of 55. When it comes to creating chatbots companies are facing challenges since the technology is new and highly complex. For most companies, the biggest difficulty is data management. This is due to the fact that so much data is required to train AI programs and when they are used, there is also a huge amount of data generated. Commercial usage of chatbots One industry that has been using chatbots for the past couple of years is the online shopping industry. The most popular function of chatbots among online shoppers globally was searching for product information. This was also the top result for consumers in Germany, followed by customer service and sending of updates about products. However, Germany did have a **************** of chatbots than the global average. Similarly, when it came to the share of those shopping online who considered chatbot customer service useful, Germany also ranked quite low, with only ** percent of respondents stating that they found it useful. Other countries such as India, UAE, and Indonesia had a *********** uptake rate.

  15. F

    Swedish Conversation Chat Dataset for BFSI Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Swedish Conversation Chat Dataset for BFSI Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/swedish-bfsi-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 10,000 chat conversations, each focusing on specific BFSI-related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 150+ native Swedish participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on BFSI topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various BFSI use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Chats:
    Account Opening
    Account Management
    Transactions
    Loan Inquiries & Applications
    Credit Card Services, and many more
    Outbound Chats:
    Product & Service Promotions
    Cross-selling & Upselling
    Customer Retention & Loyalty Programs
    Loan Application Follow-ups
    Insurance Policy Renewals/Reminders, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in Swedish BFSI interactions. This diversity ensures the dataset accurately represents the language used by Swedish speakers in BFSI contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of Swedish personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Swedish-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Swedish forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Swedish BFSI conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Swedish BFSI interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of BFSI customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    Solution Delivery
    Closing and Follow-ups
    Feedback, etc

    This structured and varied conversational flow enables the creation of advanced NLP models that can effectively manage and respond to a wide range of customer service scenarios.

    Data Format and Structure

    <p

  16. F

    Hindi Conversation Chat Dataset for Retail & E-commerce Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Hindi Conversation Chat Dataset for Retail & E-commerce Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/hindi-retail-domain-conversation-text-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The dataset comprises over 12,000 chat conversations, each focusing on specific Retail & E-Commerce related topics. Each conversation provides a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

    Participants Details: 200+ native Hindi participants from the FutureBeeAI community.
    Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.

    Topic Diversity

    The chat dataset covers a wide range of conversations on Retail & E-Commerce topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Retail & E-Commerce use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

    Inbound Chats:
    Product Inquiry
    Return/Exchange Request
    Order Cancellation
    Refund Request
    Membership/Subscriptions Enquiry
    Order Cancellations, and many more
    Outbound Chats:
    Order Confirmation
    Cross-selling and Upselling
    Account Updates
    Loyalty Program Offers
    Special Offers and Promotions
    Customer Verification, and many more

    Language Variety & Nuances

    The conversations in this dataset capture the diverse language styles and expressions prevalent in Hindi Retail & E-Commerce interactions. This diversity ensures the dataset accurately represents the language used by Hindi speakers in Retail & E-Commerce contexts.

    The dataset encompasses a wide array of language elements, including:

    Naming Conventions: Chats include a variety of Hindi personal and business names.
    Localized Details: Real-world addresses, emails, phone numbers, and other contact information as according to different Hindi-speaking regions.
    Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Hindi forms, adhering to local conventions.
    Idiomatic Expressions and Slang: It includes local slang, idioms, and informal phrase present in Hindi Retail & E-Commerce conversations.

    This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Hindi Retail & E-Commerce interactions.

    Conversational Flow and Interaction Types

    The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Retail & E-Commerce customer-agent interactions.

    Simple Inquiries
    Detailed Discussions
    Transactional Interactions
    Problem-Solving Dialogues
    Advisory Sessions
    Routine Checks and Follow-Ups

    Each of these conversations contains various aspects of conversation flow like:

    Greetings
    Authentication
    Information gathering
    Resolution identification
    Solution Delivery
    Closing and Follow-ups
    <div style="margin-top:10px;

  17. f

    Data_Sheet_1_Development and testing of a multi-lingual Natural Language...

    • frontiersin.figshare.com
    docx
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Lily Wei Yun Yang; Wei Yan Ng; Xiaofeng Lei; Shaun Chern Yuan Tan; Zhaoran Wang; Ming Yan; Mohan Kashyap Pargi; Xiaoman Zhang; Jane Sujuan Lim; Dinesh Visva Gunasekeran; Franklin Chee Ping Tan; Chen Ee Lee; Khung Keong Yeo; Hiang Khoon Tan; Henry Sun Sien Ho; Benedict Wee Bor Tan; Tien Yin Wong; Kenneth Yung Chiang Kwek; Rick Siow Mong Goh; Yong Liu; Daniel Shu Wei Ting (2023). Data_Sheet_1_Development and testing of a multi-lingual Natural Language Processing-based deep learning system in 10 languages for COVID-19 pandemic crisis: A multi-center study.docx [Dataset]. http://doi.org/10.3389/fpubh.2023.1063466.s001
    Explore at:
    docxAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Lily Wei Yun Yang; Wei Yan Ng; Xiaofeng Lei; Shaun Chern Yuan Tan; Zhaoran Wang; Ming Yan; Mohan Kashyap Pargi; Xiaoman Zhang; Jane Sujuan Lim; Dinesh Visva Gunasekeran; Franklin Chee Ping Tan; Chen Ee Lee; Khung Keong Yeo; Hiang Khoon Tan; Henry Sun Sien Ho; Benedict Wee Bor Tan; Tien Yin Wong; Kenneth Yung Chiang Kwek; Rick Siow Mong Goh; Yong Liu; Daniel Shu Wei Ting
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PurposeThe COVID-19 pandemic has drastically disrupted global healthcare systems. With the higher demand for healthcare and misinformation related to COVID-19, there is a need to explore alternative models to improve communication. Artificial Intelligence (AI) and Natural Language Processing (NLP) have emerged as promising solutions to improve healthcare delivery. Chatbots could fill a pivotal role in the dissemination and easy accessibility of accurate information in a pandemic. In this study, we developed a multi-lingual NLP-based AI chatbot, DR-COVID, which responds accurately to open-ended, COVID-19 related questions. This was used to facilitate pandemic education and healthcare delivery.MethodsFirst, we developed DR-COVID with an ensemble NLP model on the Telegram platform (https://t.me/drcovid_nlp_chatbot). Second, we evaluated various performance metrics. Third, we evaluated multi-lingual text-to-text translation to Chinese, Malay, Tamil, Filipino, Thai, Japanese, French, Spanish, and Portuguese. We utilized 2,728 training questions and 821 test questions in English. Primary outcome measurements were (A) overall and top 3 accuracies; (B) Area Under the Curve (AUC), precision, recall, and F1 score. Overall accuracy referred to a correct response for the top answer, whereas top 3 accuracy referred to an appropriate response for any one answer amongst the top 3 answers. AUC and its relevant matrices were obtained from the Receiver Operation Characteristics (ROC) curve. Secondary outcomes were (A) multi-lingual accuracy; (B) comparison to enterprise-grade chatbot systems. The sharing of training and testing datasets on an open-source platform will also contribute to existing data.ResultsOur NLP model, utilizing the ensemble architecture, achieved overall and top 3 accuracies of 0.838 [95% confidence interval (CI): 0.826–0.851] and 0.922 [95% CI: 0.913–0.932] respectively. For overall and top 3 results, AUC scores of 0.917 [95% CI: 0.911–0.925] and 0.960 [95% CI: 0.955–0.964] were achieved respectively. We achieved multi-linguicism with nine non-English languages, with Portuguese performing the best overall at 0.900. Lastly, DR-COVID generated answers more accurately and quickly than other chatbots, within 1.12–2.15 s across three devices tested.ConclusionDR-COVID is a clinically effective NLP-based conversational AI chatbot, and a promising solution for healthcare delivery in the pandemic era.

  18. h

    Bitext-retail-ecommerce-llm-chatbot-training-dataset

    • huggingface.co
    Updated Aug 6, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bitext (2024). Bitext-retail-ecommerce-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-retail-ecommerce-llm-chatbot-training-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 6, 2024
    Dataset authored and provided by
    Bitext
    License

    https://choosealicense.com/licenses/cdla-sharing-1.0/https://choosealicense.com/licenses/cdla-sharing-1.0/

    Description

    Bitext - Retail (eCommerce) Tagged Training Dataset for LLM-based Virtual Assistants

      Overview
    

    This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the [Retail (eCommerce)] sector can be easily achieved using our two-step approach to LLM… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-retail-ecommerce-llm-chatbot-training-dataset.

  19. LLM Influence on Medical Diagnostic Reasoning

    • kaggle.com
    Updated Dec 6, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Patrick L Ford (2024). LLM Influence on Medical Diagnostic Reasoning [Dataset]. http://doi.org/10.34740/kaggle/dsv/10119916
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 6, 2024
    Dataset provided by
    Kaggle
    Authors
    Patrick L Ford
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Introduction

    A new study published in JAMA Network Open revealed that ChatGPT-4 outperformed doctors in diagnosing medical conditions from case reports. The AI chatbot scored an average of 92% in the study, while doctors using the chatbot scored 76% and those without it scored 74%.

    The study involved 50 doctors (26 attending, 24 residents; median years in practice, 3 [IQR, 2-8]) who were given six case histories and graded on their ability to suggest diagnoses and explain their reasoning. The results showed that doctors often stuck to their initial diagnoses even when the chatbot suggested a better one, highlighting an overconfidence bias. Additionally, many doctors didn't fully utilise the chatbot's capabilities, treating it like a search engine instead of leveraging its ability to analyse full case histories.

    The study raises questions about how doctors think and how AI tools can be best integrated into medical practice. While AI has the potential to be a "doctor extender," providing valuable second opinions, the study suggests that more training and a shift in mindset may be needed for doctors to fully embrace and benefit from these advancements. link

    Study Findings

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F4e4c6a4ce9f191ab32e660c726c5204f%2FScreenshot%202024-12-05%2013.33.30.png?generation=1733490846716451&alt=media" alt="">

    Visualisation

    The study compares the diagnostic reasoning performance of physicians using a commercial LLM AI chatbot (ChatGPT Plus [GPT-4]: OpenAl) compared with conventional diagnostic resources (eg, UpToDate, Google): - ***Conventional Resources*-Only Group (Doctor on Own):** This group refers to doctors using only conventional resources (likely standard medical tools and knowledge) without the assistance of an LLM (large language model). - Doctor With LLM Group: This group involves doctors using conventional resources along with an LLM, which could be a tool or AI assistant helping with diagnostic reasoning. - ***LLM Alone* Group:** This group refers to the use of the LLM on its own, without any conventional resources or doctor intervention.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F7360932a01d641b6adc3594b2e5cae11%2FScreenshot%202024-12-06%2012.11.05.png?generation=1733490890087478&alt=media" alt="">

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F7e14a7c648febf04ac657f8dc51ea796%2FScreenshot%202024-12-06%2012.11.58.png?generation=1733490908679868&alt=media" alt="">

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F9b9d165a7c69b1a5624186b7904c46c0%2FScreenshot%202024-12-06%2012.12.41.png?generation=1733490932343833&alt=media" alt="">

    A Markdown document with the R code for the above plots. link

    Conclusion

    This study reveals a fascinating and potentially transformative dynamic between artificial intelligence and human medical expertise. While ChatGPT-4 demonstrated remarkable diagnostic accuracy, surpassing even experienced physicians, the study also highlighted critical challenges in integrating AI into clinical practice.

    The findings suggest that: - AI can significantly enhance diagnostic accuracy: LLMs like ChatGPT-4 have the potential to revolutionise how medical diagnoses are made, offering a level of accuracy exceeding current practices. - Human factors remain crucial: Overconfidence bias and under-utilisation of AI tools by physicians underscore the need for training and a shift in mindset to effectively leverage these advancements. Doctors must learn to collaborate with AI, viewing it as a powerful partner rather than a simple search engine. - Further research is needed: This study provides a crucial starting point for further investigation into the optimal integration of AI into healthcare. Future research should explore: - Effective training methods for physicians to utilise AI tools. - The impact of AI assistance on patient outcomes. - Ethical considerations surrounding the use of AI in medicine. - The potential for AI to address healthcare disparities.

    Ultimately, the successful integration of AI into healthcare will depend not only on technological advancements but also on a willingness among medical professionals to embrace new ways of thinking and working. By harnessing the power of AI while recognising the essential role of human expertise, we can strive towards a future where medical care is more accurate, efficient, and accessible for all.

    Patrick Ford 🥼🩺🖥

  20. h

    lmsys-chat-1m

    • huggingface.co
    • opendatalab.com
    Updated Sep 17, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Large Model Systems Organization (2023). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/lmsys/lmsys-chat-1m
    Explore at:
    Dataset updated
    Sep 17, 2023
    Dataset authored and provided by
    Large Model Systems Organization
    Description

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of use"… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/lmsys-chat-1m.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Ruslan Magana Vsevolodovna (2024). ai-medical-chatbot [Dataset]. https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot

ai-medical-chatbot

ruslanmv/ai-medical-chatbot

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 16, 2024
Authors
Ruslan Magana Vsevolodovna
Description

AI Medical Chatbot Dataset

This is an experimental Dataset designed to run a Medical Chatbot It contains at least 250k dialogues between a Patient and a Doctor.

  Playground ChatBot

ruslanmv/AI-Medical-Chatbot For furter information visit the project here: https://github.com/ruslanmv/ai-medical-chatbot

Search
Clear search
Close search
Google apps
Main menu