69 datasets found
  1. Japanese General Conversation Speech Dataset for ASR

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Japanese General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of Japanese speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Japanese communication.

    Curated by FutureBeeAI, this 40-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Japanese speech models that understand and respond to authentic Japanese accents and dialects.

    Speech Data

    The dataset comprises 40 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Japanese. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.

    Participant Diversity:
    Speakers: 80 verified native Japanese speakers from FutureBeeAI’s contributor community.
    Regions: Representing various provinces of Japan to ensure dialectal diversity and demographic balance.
    Demographics: A balanced gender ratio (60% male, 40% female) with participant ages ranging from 18 to 70 years.
    Recording Details:
    Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
    Duration: Each conversation ranges from 15 to 60 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, recorded at 16kHz sample rate.
    Environment: Quiet, echo-free settings with no background noise.

    Topic Diversity

    The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.

    Sample Topics Include:
    Family & Relationships
    Food & Recipes
    Education & Career
    Healthcare Discussions
    Social Issues
    Technology & Gadgets
    Travel & Local Culture
    Shopping & Marketplace Experiences, and many more.

    Transcription

    Each audio file is paired with a human-verified, verbatim transcription available in JSON format.

    Transcription Highlights:
    Speaker-segmented dialogues
    Time-coded utterances
    Non-speech elements (pauses, laughter, etc.)
    High transcription accuracy, achieved through a double QA pass (average WER < 5%)

    These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
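    To show how such speaker-segmented, time-coded JSON transcriptions might feed an ASR or conversational AI pipeline, here is a minimal Python sketch. The field names (segments, speaker, start, end, text) and the file name are assumptions for illustration only; the actual schema is defined by FutureBeeAI and should be checked against the delivered files.

import json

# Minimal sketch of reading one transcription file from this dataset.
# The keys "segments", "speaker", "start", "end", and "text" are assumed
# for illustration; consult the delivered JSON schema for the real names.
def load_segments(path):
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    # Collect (speaker, start_time, end_time, text) for every utterance.
    return [(s["speaker"], s["start"], s["end"], s["text"]) for s in doc["segments"]]

if __name__ == "__main__":
    for speaker, start, end, text in load_segments("conversation_0001.json"):
        print(f"[{start}-{end}] {speaker}: {text}")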

    Metadata

    The dataset comes with granular metadata for both speakers and recordings:

    Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
    Recording Metadata: Topic, duration, audio format, device type, and sample rate.

    Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.

    Usage and Applications

    This dataset is a versatile resource for multiple Japanese speech and language AI applications:

    ASR Development: Train accurate speech-to-text systems for Japanese.
    Voice Assistants: Build smart assistants capable of understanding natural Japanese conversations.

  2. 633 Hours - Japanese Conversational Speech by Mobile Phone

    • m.nexdata.ai
    • nexdata.ai
    Updated Feb 11, 2024
    Cite
    Nexdata (2024). 633 Hours - Japanese Conversational Speech by Mobile Phone [Dataset]. https://m.nexdata.ai/datasets/speechrecog/1166?source=Github
    Explore at:
    Dataset updated
    Feb 11, 2024
    Dataset authored and provided by
    Nexdata
    Variables measured
    Format, Country, Speaker, Language, Annotation, Accuracy rate, Recording device, Recording Content, Language(Region) Code, Recording Environment
    Description

    Japanese (Japan) spontaneous dialogue smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamps, speaker ID, gender, and other attributes. The dataset was collected from an extensive and geographically diverse pool of around 1,000 native speakers, enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; our datasets are GDPR, CCPA, and PIPL compliant.

  3. japanese-photos-conversation

    • huggingface.co
    Updated Nov 20, 2024
    Cite
    LLM-jp (2024). japanese-photos-conversation [Dataset]. https://huggingface.co/datasets/llm-jp/japanese-photos-conversation
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 20, 2024
    Dataset authored and provided by
    LLM-jp
    Description

    Dataset Card for japanese photos conversation

      Dataset details
    

    This dataset contains multi-turn conversational instructions about images taken in Japan. The images were sourced from https://huggingface.co/datasets/ThePioneer/japanese-photos. We input each image into GPT-4o (gpt-4o-2024-05-13) via the Azure OpenAI API to generate the instruction data. Some of the images in the original dataset were filtered by the Azure OpenAI API when they were input, resulting in a total… See the full description on the dataset page: https://huggingface.co/datasets/llm-jp/japanese-photos-conversation.
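    Because this dataset is hosted on the Hugging Face Hub, it can typically be loaded with the datasets library, as in the sketch below. The split and column names are not documented in this listing, so inspect the loaded object before relying on specific fields.

from datasets import load_dataset

# Minimal sketch: pull llm-jp/japanese-photos-conversation from the Hugging Face Hub.
# Split and column names are not listed above, so print the object to inspect them.
ds = load_dataset("llm-jp/japanese-photos-conversation")
print(ds)                      # available splits and columns
first_split = list(ds.keys())[0]
print(ds[first_split][0])      # first record of the first split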

  4. Data from: JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 10, 2021
    Cite
    Hiroshi Ishiguro (2021). JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and People-related Information Detection [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_4590252
    Explore at:
    Dataset updated
    Mar 10, 2021
    Dataset provided by
    Carlos Toshinori Ishi
    Chaoran Liu
    Changzeng Fu
    Hiroshi Ishiguro
    Description

    We conducted labeling work on a spoken Japanese dataset (I-JAS) for text classification. It contains 50 interview dialogues of two-way Japanese conversation that discuss the participants' past, present, and future; each dialogue is 30 minutes long. From this dataset, we selected the interview dialogues of native Japanese speakers as samples and annotated their sentences with 13 labels. The labeling work was conducted by native Japanese speakers who have experience with data annotation.

    labels:

    ssi/osi: subjective information
    soi/ooi: objective information
    op/sp: plan
    qu: question
    ap: apology
    th: thanking
    cc: topic changing/closing
    ag: agreement
    ds: disagreement
    re: request
    pr: proposal
    su: summarize/reformulate
    th: other

  5. japanese-speech-recognition-dataset

    • huggingface.co
    Updated Aug 2, 2025
    Cite
    Unidata NLP (2025). japanese-speech-recognition-dataset [Dataset]. https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset
    Explore at:
    Dataset updated
    Aug 2, 2025
    Authors
    Unidata NLP
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Japanese Telephone Dialogues Dataset - 513 Hours

    Dataset comprises 513 hours of high-quality telephone audio recordings in Japanese, featuring 800+ native speakers and achieving a 95% sentence accuracy rate. Designed for advancing speech recognition models and language processing, this extensive speech data corpus covers diverse topics and domains, making it ideal for training robust automatic speech recognition (ASR) systems.

      Dataset characteristics:… See the full description on the dataset page: https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset.
    
  6. 96 Hours - Japanese(Japan) Children Real-world Casual Conversation and...

    • nexdata.ai
    Updated Feb 29, 2024
    Cite
    Nexdata (2024). 96 Hours - Japanese(Japan) Children Real-world Casual Conversation and Monologue speech dataset [Dataset]. https://www.nexdata.ai/datasets/speechrecog/1328
    Explore at:
    Dataset updated
    Feb 29, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    World, Japan
    Variables measured
    Age, Format, Country, Accuracy, Language, Content category, Language(Region) Code, Recording environment, Features of annotation
    Description

    Japanese (Japan) children's real-world casual conversation and monologue speech dataset, covering self-media, conversation, live streams, lectures, variety shows, and other generic domains, mirroring real-world interactions. Transcribed with text content, speaker ID, gender, age, accent, and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (children aged 12 and younger), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; our datasets are GDPR, CCPA, and PIPL compliant.

  7. Japanese Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking telecom customers. Featuring over 40 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 40 hours of dual-channel call center recordings between native Japanese speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics, from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 80 native Japanese speakers from our verified contributor pool.
    Regions: Representing multiple provinces across Japan to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.
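    Because these are dual-channel recordings delivered as stereo WAV files, agent and customer speech can be separated by splitting the two channels. The sketch below uses only Python's standard wave module; the file name and the mapping of left/right channels to agent and customer are assumptions for illustration.

import wave

# Minimal sketch: split one dual-channel (stereo) call recording into two mono files.
# Which channel holds the agent and which holds the customer is an assumption here.
def split_stereo(path, left_out, right_out):
    with wave.open(path, "rb") as wav:
        assert wav.getnchannels() == 2, "expected a dual-channel recording"
        width = wav.getsampwidth()      # 2 bytes per sample for 16-bit audio
        rate = wav.getframerate()       # 8000 or 16000 Hz in this corpus
        frames = wav.readframes(wav.getnframes())
    left, right = bytearray(), bytearray()
    step = 2 * width                    # one interleaved frame = left sample + right sample
    for i in range(0, len(frames), step):
        left += frames[i:i + width]
        right += frames[i + width:i + step]
    for out_path, data in ((left_out, left), (right_out, right)):
        with wave.open(out_path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(bytes(data))

split_stereo("call_0001.wav", "call_0001_agent.wav", "call_0001_customer.wav")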

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.

  8. Japanese Call Center Data for Realestate AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Realestate AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking real estate customers. With over 40 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, ideal for building robust ASR models.

    Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.

    Speech Data

    The dataset features 40 hours of dual-channel call center recordings between native Japanese speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.

    Participant Diversity:
    Speakers: 80 native Japanese speakers from our verified contributor community.
    Regions: Representing different provinces across Japan to ensure accent and dialect variation.
    Participant Profile: Balanced gender mix (60% male, 40% female) and age range from 18 to 70.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted agent-customer discussions.
    Call Duration: Average 5–15 minutes per call.
    Audio Format: Stereo WAV, 16-bit, recorded at 8kHz and 16kHz.
    Recording Environment: Captured in noise-free and echo-free conditions.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.

    Inbound Calls:
    Property Inquiries
    Rental Availability
    Renovation Consultation
    Property Features & Amenities
    Investment Property Evaluation
    Ownership History & Legal Info, and more
    Outbound Calls:
    New Listing Notifications
    Post-Purchase Follow-ups
    Property Recommendations
    Value Updates
    Customer Satisfaction Surveys, and others

    Such domain-rich variety ensures model generalization across common real estate support conversations.

    Transcription

    All recordings are accompanied by precise, manually verified transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., background noise, pauses)
    High transcription accuracy with word error rate below 5% via dual-layer human review.

    These transcriptions streamline ASR and NLP development for Japanese real estate voice applications.

    Metadata

    Detailed metadata accompanies each participant and conversation:

    Participant Metadata: ID, age, gender, location, accent, and dialect.
    Conversation Metadata: Topic, call type, sentiment, sample rate, and technical details.

    This enables smart filtering, dialect-focused model training, and structured dataset exploration.

    Usage and Applications

    This dataset is ideal for voice AI and NLP systems built for the real estate sector:

  9. Call Center Conversation Speech Datasets in Japanese for Customer Service

    • data.macgence.com
    mp3
    Updated Jul 21, 2024
    Cite
    Macgence (2024). Call Center Conversation Speech Datasets in Japanese for Customer Service [Dataset]. https://data.macgence.com/dataset/call-center-conversation-speech-datasets-in-japanese-for-customer-service
    Explore at:
    Available download formats: mp3
    Dataset updated
    Jul 21, 2024
    Dataset authored and provided by
    Macgence
    License

    https://data.macgence.com/terms-and-conditions

    Time period covered
    2025
    Area covered
    Worldwide
    Variables measured
    Outcome, Call Type, Transcriptions, Audio Recordings, Speaker Metadata, Conversation Topics
    Description

    Enhance customer service with Macgence's Japanese call center dataset. Perfect for AI and analytics, delivering precise and actionable insights for innovations!

  10. Japanese Call Center Data for Healthcare AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Japanese speech recognition, spoken language understanding, and conversational AI systems. With 40 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.

    Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.

    Speech Data

    The dataset features 40 hours of dual-channel call center conversations between native Japanese speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.

    Participant Diversity:
    Speakers: 80 verified native Japanese speakers from our contributor community.
    Regions: Diverse provinces across Japan to ensure broad dialectal representation.
    Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted conversations.
    Call Duration: Each session ranges between 5 to 15 minutes.
    Audio Format: WAV format, stereo, 16-bit depth at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clear conditions without background noise or echo.

    Topic Diversity

    The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).

    Inbound Calls:
    Appointment Scheduling
    New Patient Registration
    Surgical Consultation
    Dietary Advice and Consultations
    Insurance Coverage Inquiries
    Follow-up Treatment Requests, and more
    Outbound Calls:
    Appointment Reminders
    Preventive Care Campaigns
    Test Results & Lab Reports
    Health Risk Assessment Calls
    Vaccination Updates
    Wellness Subscription Outreach, and more

    These real-world interactions help build speech models that understand healthcare domain nuances and user intent.

    Transcription

    Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.

    Transcription Includes:
    Speaker-identified Dialogues
    Time-coded Segments
    Non-speech Annotations (e.g., silence, cough)
    High transcription accuracy, with a word error rate below 5%, backed by dual-layer QA checks.

    Metadata

    Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.

    Participant Metadata: ID, gender, age, region, accent, and dialect.
    Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

    Usage and Applications

    This dataset can be used across a range of healthcare and voice AI use cases:


  11. Japanese Call Center Data for Travel AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Travel AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/travel-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Japan
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Travel industry is purpose-built to power the next generation of voice AI applications for travel booking, customer support, and itinerary assistance. With over 40 hours of unscripted, real-world conversations, the dataset enables the development of highly accurate speech recognition and natural language understanding models tailored for Japanese-speaking travelers.

    Created by FutureBeeAI, this dataset supports researchers, data scientists, and conversational AI teams in building voice technologies for airlines, travel portals, and hospitality platforms.

    Speech Data

    The dataset includes 40 hours of dual-channel audio recordings between native Japanese speakers engaged in real travel-related customer service conversations. These audio files reflect a wide variety of topics, accents, and scenarios found across the travel and tourism industry.

    Participant Diversity:
    Speakers: 80 native Japanese contributors from our verified pool.
    Regions: Covering multiple Japan provinces to capture accent and dialectal variation.
    Participant Profile: Balanced representation of age (18–70) and gender (60% male, 40% female).
    Recording Details:
    Conversation Nature: Naturally flowing, spontaneous customer-agent calls.
    Call Duration: Between 5 and 15 minutes per session.
    Audio Format: Stereo WAV, 16-bit depth, at 8kHz and 16kHz.
    Recording Environment: Captured in controlled, noise-free, echo-free settings.

    Topic Diversity

    Inbound and outbound conversations span a wide range of real-world travel support situations with varied outcomes (positive, neutral, negative).

    Inbound Calls:
    Booking Assistance
    Destination Information
    Flight Delays or Cancellations
    Support for Disabled Passengers
    Health and Safety Travel Inquiries
    Lost or Delayed Luggage, and more
    Outbound Calls:
    Promotional Travel Offers
    Customer Feedback Surveys
    Booking Confirmations
    Flight Rescheduling Alerts
    Visa Expiry Notifications, and others

    These scenarios help models understand and respond to diverse traveler needs in real-time.

    Transcription

    Each call is accompanied by manually curated, high-accuracy transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-Stamped Segments
    Non-speech Markers (e.g., pauses, coughs)
    High transcription accuracy: dual-layered transcription review ensures a word error rate under 5%.

    Metadata

    Extensive metadata enriches each call and speaker for better filtering and AI training:

    Participant Metadata: ID, age, gender, region, accent, and dialect.
    Conversation Metadata: Topic, domain, call type, sentiment, and audio specs.

    Usage and Applications

    This dataset is ideal for a variety of AI use cases in the travel and tourism space:

    ASR Systems: Train Japanese speech-to-text engines for travel platforms.

  12. J-CHAT

    • huggingface.co
    Updated Jun 4, 2025
    Cite
    SaruLab Speech group (2025). J-CHAT [Dataset]. https://huggingface.co/datasets/sarulab-speech/J-CHAT
    Explore at:
    Dataset updated
    Jun 4, 2025
    Dataset authored and provided by
    SaruLab Speech group
    License

    https://choosealicense.com/licenses/other/

    Description

    J-CHAT is a large-scale Japanese dialogue speech corpus. For a detailed explanation, please see our paper.

      PLEASE READ THIS FIRST
    

    [!IMPORTANT] TO USE THIS DATASET, YOU MUST AGREE THAT YOU WILL USE THE DATASET SOLELY FOR THE PURPOSE OF JAPANESE COPYRIGHT ACT ARTICLE 30-4.

      What's new?
    

    [!NOTE] Added transcriptions of the corpus. Transcriptions are based on reazonspeech-nemo-v2.

      How can I use this data for commercial purposes?
    

    Commercial use is not permitted. If you… See the full description on the dataset page: https://huggingface.co/datasets/sarulab-speech/J-CHAT.
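    As with the other Hub-hosted corpora in this list, J-CHAT can typically be loaded with the datasets library once the terms on the dataset page have been accepted and a Hugging Face token is configured. The sketch below streams records instead of downloading the full corpus; the split layout is not documented here, so it simply inspects the first available split.

from datasets import load_dataset

# Minimal sketch, assuming access to sarulab-speech/J-CHAT has been granted and a
# Hugging Face token is configured. streaming=True iterates records without
# downloading the whole corpus up front; decoding any audio field may additionally
# require an audio backend such as soundfile to be installed.
ds = load_dataset("sarulab-speech/J-CHAT", streaming=True)
first_split = next(iter(ds))             # name of the first available split
sample = next(iter(ds[first_split]))     # one record from that split
print(first_split, list(sample.keys()))  # inspect the fields of one record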

  13. Human-to-machine-Japanese-audio-call-center-conversations

    • huggingface.co
    Updated May 21, 2025
    Cite
    AIxBlock (2025). Human-to-machine-Japanese-audio-call-center-conversations [Dataset]. https://huggingface.co/datasets/AIxBlock/Human-to-machine-Japanese-audio-call-center-conversations
    Explore at:
    Dataset updated
    May 21, 2025
    Authors
    AIxBlock
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for Japanese audio call center human to machine conversations

    This dataset contains synthetic audio conversations in Japanese between human customers and machine agents, simulating real-world call center scenarios.

      Dataset Details
    
    
    
    
    
      Dataset Description
    

    Curated by: AIxBlock (aixblock.io)
    Funded by [optional]: AIxBlock (aixblock.io)
    Shared by [optional]: AIxBlock (aixblock.io)
    Language(s) (NLP): Japanese
    License: Creative Commons Attribution Non… See the full description on the dataset page: https://huggingface.co/datasets/AIxBlock/Human-to-machine-Japanese-audio-call-center-conversations.

  14. Phone Conversations in Japanese

    • gts.ai
    json
    Updated Nov 19, 2022
    Cite
    GTS (2022). Phone Conversations in Japanese [Dataset]. https://gts.ai/case-study/japanese-phone-conversations-data-annotation-services/
    Explore at:
    Available download formats: json
    Dataset updated
    Nov 19, 2022
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore our insightful case study on Phone Conversations in Japanese, detailing the nuances and strategies for effective communication.

  15. japanese-speech-recognition-dataset

    • huggingface.co
    Updated Mar 18, 2025
    Cite
    Unidata (2025). japanese-speech-recognition-dataset [Dataset]. https://huggingface.co/datasets/UniDataPro/japanese-speech-recognition-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 18, 2025
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Japanese Speech Dataset for recognition task

    Dataset comprises 513 hours of telephone dialogues in Japanese, collected from 878 native speakers across various topics and domains, with an impressive 98% Word Accuracy Rate. It is designed for research in speech recognition, focusing on various recognition models, primarily aimed at meeting the requirements for automatic speech recognition (ASR) systems. By utilizing this dataset, researchers and developers can advance their… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/japanese-speech-recognition-dataset.

  16. Japanese-Roleplay

    • huggingface.co
    Updated Jun 16, 2024
    Cite
    OmniAICreator (2024). Japanese-Roleplay [Dataset]. https://huggingface.co/datasets/OmniAICreator/Japanese-Roleplay
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 16, 2024
    Authors
    OmniAICreator
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Japanese-Roleplay

    This is a dialogue corpus collected from a Japanese role-playing forum (commonly known as "なりきりチャット" (narikiri chat)). Each record corresponds to a single thread. The following filtering and cleaning conditions have been applied:

    For all post_content in the posts of each record, remove response anchors. For all post_content in the posts of each record, delete posts where the post_content length is 10 characters or less. If the number of unique poster types in the… See the full description on the dataset page: https://huggingface.co/datasets/OmniAICreator/Japanese-Roleplay.

  17. Japanese Voice Conversation 0

    • kaggle.com
    Updated Jun 17, 2025
    Cite
    Rastya Widya Hapsari (2025). Japanese Voice Conversation 0 [Dataset]. https://www.kaggle.com/datasets/rastyawidyahapsari/japanese-voice-conversation-0
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rastya Widya Hapsari
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Rastya Widya Hapsari

    Released under CC0: Public Domain

    Contents

  18. Japanese Call Center Data for Delivery & Logistics AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Delivery & Logistics AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/delivery-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Delivery and Logistics industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking customers. With over 40 hours of real-world, unscripted call center audio, this dataset captures authentic delivery-related conversations essential for training high-performance ASR models.

    Curated by FutureBeeAI, this dataset empowers AI teams, logistics tech providers, and NLP researchers to build accurate, production-ready models for customer support automation in delivery and logistics.

    Speech Data

    The dataset contains 40 hours of dual-channel call center recordings between native Japanese speakers. Captured across various delivery and logistics service scenarios, these conversations cover everything from order tracking to missed delivery resolutions, offering a rich, real-world training base for AI models.

    Participant Diversity:
    Speakers: 80 native Japanese speakers from our verified contributor pool.
    Regions: Multiple provinces of Japan for accent and dialect diversity.
    Participant Profile: Balanced gender distribution (60% male, 40% female) with ages ranging from 18 to 70.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted customer-agent dialogues.
    Call Duration: 5 to 15 minutes on average.
    Audio Format: Stereo WAV, 16-bit depth, recorded at 8kHz and 16kHz.
    Recording Environment: Captured in clean, noise-free, echo-free conditions.

    Topic Diversity

    This speech corpus includes both inbound and outbound delivery-related conversations, covering varied outcomes (positive, negative, neutral) to train adaptable voice models.

    Inbound Calls:
    Order Tracking
    Delivery Complaints
    Undeliverable Addresses
    Return Process Enquiries
    Delivery Method Selection
    Order Modifications, and more
    Outbound Calls:
    Delivery Confirmations
    Subscription Offer Calls
    Incorrect Address Follow-ups
    Missed Delivery Notifications
    Delivery Feedback Surveys
    Out-of-Stock Alerts, and others

    This comprehensive coverage reflects real-world logistics workflows, helping voice AI systems interpret context and intent with precision.

    Transcription

    All recordings come with high-quality, human-generated verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, noise)
    High transcription accuracy with word error rate under 5% via dual-layer quality checks.

    These transcriptions support fast, reliable model development for Japanese voice AI applications in the delivery sector.

    Metadata

    Detailed metadata is included for each participant and conversation:

    Participant Metadata: ID, age, gender, region, accent, dialect.
    Conversation Metadata: Topic, call type, sentiment, sample rate, and technical attributes.

    This metadata aids in training specialized models, filtering demographics, and running advanced analytics.

    Usage and Applications

    This

  19. Japanese Human-Human Chat Dataset for Conversational AI & NLP

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Human-Human Chat Dataset for Conversational AI & NLP [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/japanese-general-domain-conversation-text-dataset
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The Japanese General Domain Chat Dataset is a high-quality, text-based dataset designed to train and evaluate conversational AI, NLP models, and smart assistants in real-world Japanese usage. Collected through FutureBeeAI’s trusted crowd community, this dataset reflects natural, native-level Japanese conversations covering a broad spectrum of everyday topics.

    Conversational Text Data

    This dataset includes over 15,000 chat transcripts, each featuring free-flowing dialogue between two native Japanese speakers. The conversations are spontaneous, context-rich, and mimic informal, real-life texting behavior.

    Words per Chat: 300–700
    Turns per Chat: Up to 50 dialogue turns
    Contributors: 200 native Japanese speakers from the FutureBeeAI Crowd Community
    Format: TXT, DOCS, JSON or CSV (customizable)
    Structure: Each record contains the full chat, topic tag, and metadata block
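    As a rough illustration of the record structure described above (full chat, topic tag, and metadata block), the following Python sketch builds one hypothetical record and checks it against the stated turn and length ranges. All field names and values are invented for illustration; the delivered TXT/DOCS/JSON/CSV layout may differ.

# Hypothetical chat record illustrating the structure described above.
# Every key and value here is invented; consult the delivered format for the real schema.
record = {
    "topic": "Food and cooking",
    "metadata": {
        "participant_ages": [27, 31],
        "genders": ["female", "male"],
        "country_region": "Japan",
        "chat_domain": "General",
        "dialect": "Kansai",
    },
    "chat": [
        {"speaker": "A", "text": "昨日、新しいラーメン屋に行ってきたよ。"},
        {"speaker": "B", "text": "いいね！どこのお店？おいしかった？"},
        # ... further turns, up to the stated maximum of 50 ...
    ],
}

turns = len(record["chat"])
length = sum(len(turn["text"]) for turn in record["chat"])  # rough length; Japanese has no word spaces
print(f"turns={turns} (spec: up to 50), length={length} (spec: 300-700 words per chat)")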

    Diversity and Domain Coverage

    Conversations span a wide variety of general-domain topics to ensure comprehensive model exposure:

    Music, books, and movies
    Health and wellness
    Children and parenting
    Family life and relationships
    Food and cooking
    Education and studying
    Festivals and traditions
    Environment and daily life
    Internet and tech usage
    Childhood memories and casual chatting

    This diversity ensures the dataset is useful across multiple NLP and language understanding applications.

    Linguistic Authenticity

    Chats reflect informal, native-level Japanese usage with:

    Colloquial expressions and local dialect influence
    Domain-relevant terminology
    Language-specific grammar, phrasing, and sentence flow
    Inclusion of realistic details such as names, phone numbers, email addresses, locations, dates, times, local currencies, and culturally grounded references
    Representation of different writing styles and input quirks to ensure training data realism

    Metadata

    Every chat instance is accompanied by structured metadata, which includes:

    Participant Age
    Gender
    Country/Region
    Chat Domain
    Chat Topic
    Dialect

    This metadata supports model filtering, demographic-specific evaluation, and more controlled fine-tuning workflows.

    Data Quality Assurance

    All chat records pass through a rigorous QA process to maintain consistency and accuracy:

    Manual review for content completeness
    Format checks for chat turns and metadata
    Linguistic verification by native speakers
    Removal of inappropriate or unusable samples

    This ensures a clean, reliable dataset ready for high-performance AI model training.

    Applications

    This dataset is ideal for training and evaluating a wide range of text-based AI systems:

    Conversational AI / Chatbots
    Smart assistants and voicebots

  20. Sales of foreign language conversation schools Japan FY 2014-2023

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Sales of foreign language conversation schools Japan FY 2014-2023 [Dataset]. https://www.statista.com/statistics/1199587/japan-sales-foreign-language-conversation-schools/
    Explore at:
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Japan
    Description

    In the fiscal year 2023, the total sales of foreign language conversation schools in Japan amounted to around ***** billion Japanese yen. That year, there were about *** thousand business establishments of such schools in the country.
