51 datasets found
  1. Hindi Conversation Chat Dataset for Telecom Domain
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  2. Hindi Conversation Chat Dataset for Healthcare Domain
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  3. Hindi Language Datasets | Audio Data for ASR, Virtual Assistant
    • fr.shaip.com
    • pl.shaip.com
    • +41 more
    Updated Jul 17, 2023
  4. Hindi (India) General Conversation Speech Dataset
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  5. 797 Hours - Hindi(India) Spontaneous Dialogue Smartphone speech dataset
    • m.nexdata.ai
    Updated Nov 19, 2023
  6. 34 Hours - Hindi(India) Children Real-world Casual Conversation and...
    • m.nexdata.ai
    • nexdata.ai
    Updated Mar 20, 2024
  7. General domain Human-Human conversation chats in Hindi
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  8. Cross-Hindi-Hinglish-chat
    • huggingface.co
    Updated Mar 20, 2024
  9. gooftagoo
    • huggingface.co
    Updated Mar 17, 2024
  10. Hindi Conversation Chat Dataset for Delivery & Logistics Domain
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  11. Hindi-English TED talks, Wikipedia articles, etc.
    • kaggle.com
    zip
    Updated Oct 31, 2020
  12. 494 Hours - Hindi(India) Real-world Casual Conversation and Monologue speech...
    • nexdata.ai
    Updated Nov 11, 2023
  13. Hindi Conversation Chat Dataset for Real Estate Domain
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  14. 760 Hours - Hindi(India) Spontaneous Dialogue Telephony speech dataset
    • nexdata.ai
    • m.nexdata.ai
    Updated May 31, 2023
  15. indic-instruct-data-v0.1
    • huggingface.co
    Updated Jan 25, 2024
  16. hind_encorp
    • huggingface.co
    • paperswithcode.com
    • +3 more
    Updated Mar 22, 2014
  17. IndicDialogue Dataset
    • data.mendeley.com
    Updated Jun 11, 2024
  18. Hindi Conversation Chat Dataset for Travel Domain
    • futurebeeai.com
    wav
    Updated Aug 1, 2022
  19. Hindi-English Off-the-Shelf Datasets
    • bn.shaip.com
    • no.shaip.com
    • +3 more
    json
    Updated Jan 10, 2023
  20. cmu_hinglish_dog
    • huggingface.co
Cite
FutureBee AI (2022). Hindi Conversation Chat Dataset for Telecom Domain [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/hindi-telecom-domain-conversation-text-dataset

Hindi Conversation Chat Dataset for Telecom Domain

Available download formats: wav
Dataset updated: Aug 1, 2022
Dataset provided by: FutureBeeAI
Authors: FutureBee AI
License: https://www.futurebeeai.com/data-license-agreement
Dataset funded by: FutureBeeAI
Description

Introduction

The dataset comprises over 12,000 chat conversations, each focusing on a specific Telecom-related topic. Each conversation is a detailed interaction between a call center agent and a customer, capturing real-life scenarios and language nuances.

Participant Details: 200+ native Hindi participants from the FutureBeeAI community.
Word Count & Length: Chats are diverse, averaging 300 to 700 words and 50 to 150 turns across both speakers.
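
The length figures above can be checked on a per-chat basis. The following is a minimal sketch only: the listing does not document the file layout, so the in-memory representation (a chat as a list of `(speaker, text)` turns) and the function names `chat_stats` and `within_described_range` are assumptions made for illustration. Words are counted by splitting on whitespace, which is a rough approximation for Hindi text.

```python
from typing import List, Tuple

# Hypothetical representation: one chat is a list of (speaker, text) turns.
# The real dataset layout is not described in the listing.
Chat = List[Tuple[str, str]]


def chat_stats(chat: Chat) -> Tuple[int, int]:
    """Return (word_count, turn_count) for a single chat."""
    # Whitespace splitting is a rough word count, not a linguistic tokenizer.
    word_count = sum(len(text.split()) for _, text in chat)
    return word_count, len(chat)


def within_described_range(
    chat: Chat,
    words: Tuple[int, int] = (300, 700),
    turns: Tuple[int, int] = (50, 150),
) -> bool:
    """Check a chat against the word and turn ranges quoted in the description."""
    word_count, turn_count = chat_stats(chat)
    return (words[0] <= word_count <= words[1]
            and turns[0] <= turn_count <= turns[1])
```

Because the description quotes these ranges as averages, individual chats may reasonably fall outside them; a check like this is better suited to summarizing a sample than to rejecting records.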

Topic Diversity

The chat dataset covers a wide range of conversations on Telecom topics, ensuring that the dataset is comprehensive and relevant for training and fine-tuning models for various Telecom use cases. It offers diversity in terms of conversation topics, chat types, and outcomes, including both inbound and outbound chats with positive, neutral, and negative outcomes.

Inbound Chats:
Phone Number Porting
Network Connectivity Issues
Billing and Payments
Technical Support
Service Activation
International Roaming Enquiry
Refunds and Billing Adjustments
Emergency Service Access, and many more
Outbound Chats:
Welcome Calls / Onboarding Process
Payment Reminders
Customer Surveys
Technical Updates
Service Usage Reviews
Network Complaint Update, and many more
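
To see how a sample is distributed across chat types and outcomes, the chats can be tallied by both attributes. This is a sketch under stated assumptions: the field names `direction` and `outcome` and the function `tally_by_type_and_outcome` are hypothetical, since the listing does not specify how this metadata is stored.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally_by_type_and_outcome(
    records: Iterable[Dict[str, str]],
) -> "Counter[Tuple[str, str]]":
    """Count chats per (direction, outcome) pair.

    Each record is assumed to carry a hypothetical "direction" field
    ("inbound" or "outbound") and a hypothetical "outcome" field
    ("positive", "neutral", or "negative").
    """
    return Counter((r["direction"], r["outcome"]) for r in records)


# Illustrative records only; these are not taken from the dataset.
sample = [
    {"direction": "inbound", "outcome": "positive"},
    {"direction": "inbound", "outcome": "positive"},
    {"direction": "outbound", "outcome": "negative"},
]
counts = tally_by_type_and_outcome(sample)
```

A `Counter` returns zero for pairs that never occur, so missing combinations such as outbound chats with a neutral outcome show up as gaps without raising an error.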

Language Variety & Nuances

The conversations in this dataset capture the diverse language styles and expressions prevalent in Hindi Telecom interactions. This diversity ensures the dataset accurately represents the language used by Hindi speakers in Telecom contexts.

The dataset encompasses a wide array of language elements, including:

Naming Conventions: Chats include a variety of Hindi personal and business names.
Localized Details: Real-world addresses, emails, phone numbers, and other contact information in the formats used across different Hindi-speaking regions.
Temporal and Numeric Expressions: Dates, times, currencies, and numbers in Hindi forms, adhering to local conventions.
Idiomatic Expressions and Slang: Local slang, idioms, and informal phrases found in Hindi Telecom conversations.

This linguistic authenticity ensures that the dataset equips researchers and developers with a comprehensive understanding of the intricate language patterns, cultural references, and communication styles inherent to Hindi Telecom interactions.

Conversational Flow and Interaction Types

The dataset includes a broad range of conversations, from simple inquiries to detailed discussions, capturing the dynamic nature of Telecom customer-agent interactions.

Simple Inquiries
Detailed Discussions
Transactional Interactions
Problem-Solving Dialogues
Advisory Sessions
Routine Checks and Follow-Ups

Each conversation includes several stages of conversational flow, such as:

Greetings
Authentication
Information gathering
Resolution identification