5 datasets found
  1. F

    American English Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). American English Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-english-usa
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This US English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native US English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 60 native US English speakers from our verified contributor pool.
    Regions: Representing multiple provinces across United States of America to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.
    <div style="margin-top:10px; margin-bottom: 10px; padding-left: 30px; display: flex; gap: 16px;

  2. F

    Canadian English Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Canadian English Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-english-canada
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Canada
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Canadian English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native Canadian English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 60 native Canadian English speakers from our verified contributor pool.
    Regions: Representing multiple provinces across Canada to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.
    <div style="margin-top:10px; margin-bottom: 10px; padding-left: 30px; display: flex; gap: 16px;

  3. F

    Hindi Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Hindi Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-hindi-india
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Hindi Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Hindi-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native Hindi speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 60 native Hindi speakers from our verified contributor pool.
    Regions: Representing multiple provinces across India to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.

  4. F

    British English Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). British English Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-english-uk
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United Kingdom
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This UK English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native UK English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 60 native UK English speakers from our verified contributor pool.
    Regions: Representing multiple provinces across United Kingdom to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.
    <div style="margin-top:10px; margin-bottom: 10px; padding-left: 30px; display: flex; gap: 16px; align-items:

  5. F

    Indian English Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Indian English Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-english-india
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Indian English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native Indian English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 60 native Indian English speakers from our verified contributor pool.
    Regions: Representing multiple provinces across India to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.
    <div style="margin-top:10px; margin-bottom: 10px; padding-left: 30px; display: flex; gap: 16px;

  6. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
FutureBee AI (2022). American English Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-english-usa

American English Call Center Data for Telecom AI

American English call center speech corpus in telecom industry

Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License

https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

Area covered
United States
Dataset funded by
FutureBeeAI
Description

Introduction

This US English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

Speech Data

The dataset contains 30 hours of dual-channel call center recordings between native US English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

Participant Diversity:
Speakers: 60 native US English speakers from our verified contributor pool.
Regions: Representing multiple provinces across United States of America to ensure coverage of various accents and dialects.
Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
Recording Details:
Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
Call Duration: Ranges from 5 to 15 minutes.
Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
Recording Environment: Captured in clean conditions with no echo or background noise.

Topic Diversity

This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral ensuring broad scenario coverage for telecom AI development.

Inbound Calls:
Phone Number Porting
Network Connectivity Issues
Billing and Payments
Technical Support
Service Activation
International Roaming Enquiry
Refund Requests and Billing Adjustments
Emergency Service Access, and others
Outbound Calls:
Welcome Calls & Onboarding
Payment Reminders
Customer Satisfaction Surveys
Technical Updates
Service Usage Reviews
Network Complaint Status Calls, and more

This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

Transcription

All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

Transcription Includes:
Speaker-Segmented Dialogues
Time-coded Segments
Non-speech Tags (e.g., pauses, coughs)
High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

Metadata

Rich metadata is available for each participant and conversation:

Participant Metadata: ID, age, gender, accent, dialect, and location.
<div style="margin-top:10px; margin-bottom: 10px; padding-left: 30px; display: flex; gap: 16px;

Search
Clear search
Close search
Google apps
Main menu