69 datasets found
  1. Japanese General Conversation Speech Dataset for ASR

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Japanese General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of Japanese speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Japanese communication.

    Curated by FutureBeeAI, this 40-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Japanese speech models that understand and respond to authentic Japanese accents and dialects.

    Speech Data

    The dataset comprises 40 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Japanese. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.

    Participant Diversity:
    Speakers: 80 verified native Japanese speakers from FutureBeeAI’s contributor community.
    Regions: Representing various provinces of Japan to ensure dialectal diversity and demographic balance.
    Demographics: A balanced gender ratio (60% male, 40% female) with participant ages ranging from 18 to 70 years.
    Recording Details:
    Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
    Duration: Each conversation ranges from 15 to 60 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, recorded at 16kHz sample rate.
    Environment: Quiet, echo-free settings with no background noise.

    Topic Diversity

    The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.

    Sample Topics Include:
    Family & Relationships
    Food & Recipes
    Education & Career
    Healthcare Discussions
    Social Issues
    Technology & Gadgets
    Travel & Local Culture
    Shopping & Marketplace Experiences, and many more.

    Transcription

    Each audio file is paired with a human-verified, verbatim transcription available in JSON format.

    Transcription Highlights:
    Speaker-segmented dialogues
    Time-coded utterances
    Non-speech elements (pauses, laughter, etc.)
    High transcription accuracy, achieved through a double QA pass (average WER < 5%)

    These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
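    To show how such speaker-segmented, time-coded JSON transcriptions might feed an ASR or conversational AI pipeline, here is a minimal Python sketch. The field names (segments, speaker, start, end, text) and the file name are assumptions for illustration only; the actual schema is defined by FutureBeeAI and should be checked against the delivered files.

import json

# Minimal sketch of reading one transcription file from this dataset.
# The keys "segments", "speaker", "start", "end", and "text" are assumed
# for illustration; consult the delivered JSON schema for the real names.
def load_segments(path):
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    # Collect (speaker, start_time, end_time, text) for every utterance.
    return [(s["speaker"], s["start"], s["end"], s["text"]) for s in doc["segments"]]

if __name__ == "__main__":
    for speaker, start, end, text in load_segments("conversation_0001.json"):
        print(f"[{start}-{end}] {speaker}: {text}")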

    Metadata

    The dataset comes with granular metadata for both speakers and recordings:

    Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
    Recording Metadata: Topic, duration, audio format, device type, and sample rate.

    Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.

    Usage and Applications

    This dataset is a versatile resource for multiple Japanese speech and language AI applications:

    ASR Development: Train accurate speech-to-text systems for Japanese.
    Voice Assistants: Build smart assistants capable of understanding natural Japanese conversations.

  2. 633 Hours - Japanese Conversational Speech by Mobile Phone

    • m.nexdata.ai
    • nexdata.ai
    Updated Feb 11, 2024
    Cite
    Nexdata (2024). 633 Hours - Japanese Conversational Speech by Mobile Phone [Dataset]. https://m.nexdata.ai/datasets/speechrecog/1166?source=Github
    Explore at:
    Dataset updated
    Feb 11, 2024
    Dataset authored and provided by
    Nexdata
    Variables measured
    Format, Country, Speaker, Language, Annotation, Accuracy rate, Recording device, Recording Content, Language(Region) Code, Recording Environment
    Description

    Japanese (Japan) spontaneous dialogue smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamps, speaker ID, gender, and other attributes. The dataset was collected from an extensive and geographically diverse pool of around 1,000 native speakers, enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; our datasets are GDPR, CCPA, and PIPL compliant.

  3. japanese-photos-conversation

    • huggingface.co
    Updated Nov 20, 2024
    Cite
    LLM-jp (2024). japanese-photos-conversation [Dataset]. https://huggingface.co/datasets/llm-jp/japanese-photos-conversation
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 20, 2024
    Dataset authored and provided by
    LLM-jp
    Description

    Dataset Card for japanese photos conversation

      Dataset details
    

    This dataset contains multi-turn conversational instructions about images taken in Japan. The images were sourced from https://huggingface.co/datasets/ThePioneer/japanese-photos. We input each image into GPT-4o (gpt-4o-2024-05-13) via the Azure OpenAI API to generate the instruction data. Some of the images in the original dataset were filtered by the Azure OpenAI API when they were input, resulting in a total… See the full description on the dataset page: https://huggingface.co/datasets/llm-jp/japanese-photos-conversation.
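    Because this dataset is hosted on the Hugging Face Hub, it can typically be loaded with the datasets library, as in the sketch below. The split and column names are not documented in this listing, so inspect the loaded object before relying on specific fields.

from datasets import load_dataset

# Minimal sketch: pull llm-jp/japanese-photos-conversation from the Hugging Face Hub.
# Split and column names are not listed above, so print the object to inspect them.
ds = load_dataset("llm-jp/japanese-photos-conversation")
print(ds)                      # available splits and columns
first_split = list(ds.keys())[0]
print(ds[first_split][0])      # first record of the first split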

  4. Data from: JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 10, 2021
    Cite
    Hiroshi Ishiguro (2021). JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and People-related Information Detection [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_4590252
    Explore at:
    Dataset updated
    Mar 10, 2021
    Dataset provided by
    Carlos Toshinori Ishi
    Chaoran Liu
    Changzeng Fu
    Hiroshi Ishiguro
    Description

    We conducted labeling work on a spoken Japanese dataset (I-JAS) for text classification. It contains 50 interview dialogues of two-way Japanese conversation that discuss the participants' past, present, and future; each dialogue is 30 minutes long. From this dataset, we selected the interview dialogues of native Japanese speakers as samples and annotated their sentences with 13 labels. The labeling work was conducted by native Japanese speakers who have experience with data annotation.

    labels:

    ssi/osi: subjective information
    soi/ooi: objective information
    op/sp: plan
    qu: question
    ap: apology
    th: thanking
    cc: topic changing/closing
    ag: agreement
    ds: disagreement
    re: request
    pr: proposal
    su: summarize/reformulate
    th: other

  5. japanese-speech-recognition-dataset

    • huggingface.co
    Updated Aug 2, 2025
    Cite
    Unidata NLP (2025). japanese-speech-recognition-dataset [Dataset]. https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset
    Explore at:
    Dataset updated
    Aug 2, 2025
    Authors
    Unidata NLP
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Japanese Telephone Dialogues Dataset - 513 Hours

    Dataset comprises 513 hours of high-quality telephone audio recordings in Japanese, featuring 800+ native speakers and achieving a 95% sentence accuracy rate. Designed for advancing speech recognition models and language processing, this extensive speech data corpus covers diverse topics and domains, making it ideal for training robust automatic speech recognition (ASR) systems.

      Dataset characteristics:… See the full description on the dataset page: https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset.
    
  6. 96 Hours - Japanese(Japan) Children Real-world Casual Conversation and...

    • nexdata.ai
    Updated Feb 29, 2024
    Cite
    Nexdata (2024). 96 Hours - Japanese(Japan) Children Real-world Casual Conversation and Monologue speech dataset [Dataset]. https://www.nexdata.ai/datasets/speechrecog/1328
    Explore at:
    Dataset updated
    Feb 29, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    World, Japan
    Variables measured
    Age, Format, Country, Accuracy, Language, Content category, Language(Region) Code, Recording environment, Features of annotation
    Description

    Japanese (Japan) children's real-world casual conversation and monologue speech dataset, covering self-media, conversation, live streams, lectures, variety shows, and other generic domains, mirroring real-world interactions. Transcribed with text content, speaker ID, gender, age, accent, and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (children aged 12 and younger), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; our datasets are GDPR, CCPA, and PIPL compliant.

  7. Japanese Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking telecom customers. Featuring over 40 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 40 hours of dual-channel call center recordings between native Japanese speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics, from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    Speakers: 80 native Japanese speakers from our verified contributor pool.
    Regions: Representing multiple provinces across Japan to ensure coverage of various accents and dialects.
    Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    Call Duration: Ranges from 5 to 15 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clean conditions with no echo or background noise.
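    Because these are dual-channel recordings delivered as stereo WAV files, agent and customer speech can be separated by splitting the two channels. The sketch below uses only Python's standard wave module; the file name and the mapping of left/right channels to agent and customer are assumptions for illustration.

import wave

# Minimal sketch: split one dual-channel (stereo) call recording into two mono files.
# Which channel holds the agent and which holds the customer is an assumption here.
def split_stereo(path, left_out, right_out):
    with wave.open(path, "rb") as wav:
        assert wav.getnchannels() == 2, "expected a dual-channel recording"
        width = wav.getsampwidth()      # 2 bytes per sample for 16-bit audio
        rate = wav.getframerate()       # 8000 or 16000 Hz in this corpus
        frames = wav.readframes(wav.getnframes())
    left, right = bytearray(), bytearray()
    step = 2 * width                    # one interleaved frame = left sample + right sample
    for i in range(0, len(frames), step):
        left += frames[i:i + width]
        right += frames[i + width:i + step]
    for out_path, data in ((left_out, left), (right_out, right)):
        with wave.open(out_path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(bytes(data))

split_stereo("call_0001.wav", "call_0001_agent.wav", "call_0001_customer.wav")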

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refund Requests and Billing Adjustments
    Emergency Service Access, and others
    Outbound Calls:
    Welcome Calls & Onboarding
    Payment Reminders
    Customer Satisfaction Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, coughs)
    High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.

  8. Japanese Call Center Data for Realestate AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Realestate AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking real estate customers. With over 40 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, ideal for building robust ASR models.

    Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.

    Speech Data

    The dataset features 40 hours of dual-channel call center recordings between native Japanese speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.

    Participant Diversity:
    Speakers: 80 native Japanese speakers from our verified contributor community.
    Regions: Representing different provinces across Japan to ensure accent and dialect variation.
    Participant Profile: Balanced gender mix (60% male, 40% female) and age range from 18 to 70.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted agent-customer discussions.
    Call Duration: Average 5–15 minutes per call.
    Audio Format: Stereo WAV, 16-bit, recorded at 8kHz and 16kHz.
    Recording Environment: Captured in noise-free and echo-free conditions.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.

    Inbound Calls:
    Property Inquiries
    Rental Availability
    Renovation Consultation
    Property Features & Amenities
    Investment Property Evaluation
    Ownership History & Legal Info, and more
    Outbound Calls:
    New Listing Notifications
    Post-Purchase Follow-ups
    Property Recommendations
    Value Updates
    Customer Satisfaction Surveys, and others

    Such domain-rich variety ensures model generalization across common real estate support conversations.

    Transcription

    All recordings are accompanied by precise, manually verified transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., background noise, pauses)
    High transcription accuracy with word error rate below 5% via dual-layer human review.

    These transcriptions streamline ASR and NLP development for Japanese real estate voice applications.

    Metadata

    Detailed metadata accompanies each participant and conversation:

    Participant Metadata: ID, age, gender, location, accent, and dialect.
    Conversation Metadata: Topic, call type, sentiment, sample rate, and technical details.

    This enables smart filtering, dialect-focused model training, and structured dataset exploration.

    Usage and Applications

    This dataset is ideal for voice AI and NLP systems built for the real estate sector:

  9. Call Center Conversation Speech Datasets in Japanese for Customer Service

    • data.macgence.com
    mp3
    Updated Jul 21, 2024
    Cite
    Macgence (2024). Call Center Conversation Speech Datasets in Japanese for Customer Service [Dataset]. https://data.macgence.com/dataset/call-center-conversation-speech-datasets-in-japanese-for-customer-service
    Explore at:
    Available download formats: mp3
    Dataset updated
    Jul 21, 2024
    Dataset authored and provided by
    Macgence
    License

    https://data.macgence.com/terms-and-conditions

    Time period covered
    2025
    Area covered
    Worldwide
    Variables measured
    Outcome, Call Type, Transcriptions, Audio Recordings, Speaker Metadata, Conversation Topics
    Description

    Enhance customer service with Macgence's Japanese call center dataset. Perfect for AI and analytics, delivering precise and actionable insights for innovations!

  10. Japanese Call Center Data for Healthcare AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Japanese speech recognition, spoken language understanding, and conversational AI systems. With 40 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.

    Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.

    Speech Data

    The dataset features 40 hours of dual-channel call center conversations between native Japanese speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.

    Participant Diversity:
    Speakers: 80 verified native Japanese speakers from our contributor community.
    Regions: Diverse provinces across Japan to ensure broad dialectal representation.
    Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted conversations.
    Call Duration: Each session ranges between 5 to 15 minutes.
    Audio Format: WAV format, stereo, 16-bit depth at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clear conditions without background noise or echo.

    Topic Diversity

    The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).

    Inbound Calls:
    Appointment Scheduling
    New Patient Registration
    Surgical Consultation
    Dietary Advice and Consultations
    Insurance Coverage Inquiries
    Follow-up Treatment Requests, and more
    Outbound Calls:
    Appointment Reminders
    Preventive Care Campaigns
    Test Results & Lab Reports
    Health Risk Assessment Calls
    Vaccination Updates
    Wellness Subscription Outreach, and more

    These real-world interactions help build speech models that understand healthcare domain nuances and user intent.

    Transcription

    Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.

    Transcription Includes:
    Speaker-identified Dialogues
    Time-coded Segments
    Non-speech Annotations (e.g., silence, cough)
    High transcription accuracy, with a word error rate below 5%, backed by dual-layer QA checks.

    Metadata

    Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.

    Participant Metadata: ID, gender, age, region, accent, and dialect.
    Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

    Usage and Applications

    This dataset can be used across a range of healthcare and voice AI use cases:


  11. Japanese Call Center Data for Travel AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Travel AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/travel-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Japan
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Travel industry is purpose-built to power the next generation of voice AI applications for travel booking, customer support, and itinerary assistance. With over 40 hours of unscripted, real-world conversations, the dataset enables the development of highly accurate speech recognition and natural language understanding models tailored for Japanese-speaking travelers.

    Created by FutureBeeAI, this dataset supports researchers, data scientists, and conversational AI teams in building voice technologies for airlines, travel portals, and hospitality platforms.

    Speech Data

    The dataset includes 40 hours of dual-channel audio recordings between native Japanese speakers engaged in real travel-related customer service conversations. These audio files reflect a wide variety of topics, accents, and scenarios found across the travel and tourism industry.

    Participant Diversity:
    Speakers: 80 native Japanese contributors from our verified pool.
    Regions: Covering multiple Japan provinces to capture accent and dialectal variation.
    Participant Profile: Balanced representation of age (18–70) and gender (60% male, 40% female).
    Recording Details:
    Conversation Nature: Naturally flowing, spontaneous customer-agent calls.
    Call Duration: Between 5 and 15 minutes per session.
    Audio Format: Stereo WAV, 16-bit depth, at 8kHz and 16kHz.
    Recording Environment: Captured in controlled, noise-free, echo-free settings.

    Topic Diversity

    Inbound and outbound conversations span a wide range of real-world travel support situations with varied outcomes (positive, neutral, negative).

    Inbound Calls:
    Booking Assistance
    Destination Information
    Flight Delays or Cancellations
    Support for Disabled Passengers
    Health and Safety Travel Inquiries
    Lost or Delayed Luggage, and more
    Outbound Calls:
    Promotional Travel Offers
    Customer Feedback Surveys
    Booking Confirmations
    Flight Rescheduling Alerts
    Visa Expiry Notifications, and others

    These scenarios help models understand and respond to diverse traveler needs in real-time.

    Transcription

    Each call is accompanied by manually curated, high-accuracy transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-Stamped Segments
    Non-speech Markers (e.g., pauses, coughs)
    High transcription accuracy: dual-layered transcription review ensures a word error rate under 5%.

    Metadata

    Extensive metadata enriches each call and speaker for better filtering and AI training:

    Participant Metadata: ID, age, gender, region, accent, and dialect.
    Conversation Metadata: Topic, domain, call type, sentiment, and audio specs.

    Usage and Applications

    This dataset is ideal for a variety of AI use cases in the travel and tourism space:

    ASR Systems: Train Japanese speech-to-text engines for travel platforms.

  12. J-CHAT

    • huggingface.co
    Updated Jun 4, 2025
    Cite
    SaruLab Speech group (2025). J-CHAT [Dataset]. https://huggingface.co/datasets/sarulab-speech/J-CHAT
    Explore at:
    Dataset updated
    Jun 4, 2025
    Dataset authored and provided by
    SaruLab Speech group
    License

    https://choosealicense.com/licenses/other/

    Description

    J-CHAT is a large-scale Japanese dialogue speech corpus. For a detailed explanation, please see our paper.

      PLEASE READ THIS FIRST
    

    [!IMPORTANT] TO USE THIS DATASET, YOU MUST AGREE THAT YOU WILL USE THE DATASET SOLELY FOR THE PURPOSE OF JAPANESE COPYRIGHT ACT ARTICLE 30-4.

      What's new?
    

    [!NOTE] Added transcriptions of the corpus. Transcriptions are based on reazonspeech-nemo-v2.

      How can I use this data for commercial purposes?
    

    Commercial use is not permitted. If you… See the full description on the dataset page: https://huggingface.co/datasets/sarulab-speech/J-CHAT.
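    As with the other Hub-hosted corpora in this list, J-CHAT can typically be loaded with the datasets library once the terms on the dataset page have been accepted and a Hugging Face token is configured. The sketch below streams records instead of downloading the full corpus; the split layout is not documented here, so it simply inspects the first available split.

from datasets import load_dataset

# Minimal sketch, assuming access to sarulab-speech/J-CHAT has been granted and a
# Hugging Face token is configured. streaming=True iterates records without
# downloading the whole corpus up front; decoding any audio field may additionally
# require an audio backend such as soundfile to be installed.
ds = load_dataset("sarulab-speech/J-CHAT", streaming=True)
first_split = next(iter(ds))             # name of the first available split
sample = next(iter(ds[first_split]))     # one record from that split
print(first_split, list(sample.keys()))  # inspect the fields of one record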

  13. Human-to-machine-Japanese-audio-call-center-conversations

    • huggingface.co
    Updated May 21, 2025
    Cite
    AIxBlock (2025). Human-to-machine-Japanese-audio-call-center-conversations [Dataset]. https://huggingface.co/datasets/AIxBlock/Human-to-machine-Japanese-audio-call-center-conversations
    Explore at:
    Dataset updated
    May 21, 2025
    Authors
    AIxBlock
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for Japanese audio call center human to machine conversations

    This dataset contains synthetic audio conversations in Japanese between human customers and machine agents, simulating real-world call center scenarios.

      Dataset Details
    
    
    
    
    
      Dataset Description
    

    Curated by: AIxBlock (aixblock.io)
    Funded by [optional]: AIxBlock (aixblock.io)
    Shared by [optional]: AIxBlock (aixblock.io)
    Language(s) (NLP): Japanese
    License: Creative Commons Attribution Non… See the full description on the dataset page: https://huggingface.co/datasets/AIxBlock/Human-to-machine-Japanese-audio-call-center-conversations.

  14. Phone Conversations in Japanese

    • gts.ai
    json
    Updated Nov 19, 2022
    Cite
    GTS (2022). Phone Conversations in Japanese [Dataset]. https://gts.ai/case-study/japanese-phone-conversations-data-annotation-services/
    Explore at:
    Available download formats: json
    Dataset updated
    Nov 19, 2022
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore our insightful case study on Phone Conversations in Japanese, detailing the nuances and strategies for effective communication.

  15. japanese-speech-recognition-dataset

    • huggingface.co
    Updated Mar 18, 2025
    Cite
    Unidata (2025). japanese-speech-recognition-dataset [Dataset]. https://huggingface.co/datasets/UniDataPro/japanese-speech-recognition-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 18, 2025
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Japanese Speech Dataset for recognition task

    Dataset comprises 513 hours of telephone dialogues in Japanese, collected from 878 native speakers across various topics and domains, with an impressive 98% Word Accuracy Rate. It is designed for research in speech recognition, focusing on various recognition models, primarily aimed at meeting the requirements for automatic speech recognition (ASR) systems. By utilizing this dataset, researchers and developers can advance their… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/japanese-speech-recognition-dataset.

  16. Japanese-Roleplay

    • huggingface.co
    Updated Jun 16, 2024
    Cite
    OmniAICreator (2024). Japanese-Roleplay [Dataset]. https://huggingface.co/datasets/OmniAICreator/Japanese-Roleplay
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 16, 2024
    Authors
    OmniAICreator
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Japanese-Roleplay

    This is a dialogue corpus collected from a Japanese role-playing forum (commonly known as "なりきりチャット" (narikiri chat)). Each record corresponds to a single thread. The following filtering and cleaning conditions have been applied:

    For all post_content in the posts of each record, remove response anchors. For all post_content in the posts of each record, delete posts where the post_content length is 10 characters or less. If the number of unique poster types in the… See the full description on the dataset page: https://huggingface.co/datasets/OmniAICreator/Japanese-Roleplay.

  17. Japanese Voice Conversation 0

    • kaggle.com
    Updated Jun 17, 2025
    Cite
    Rastya Widya Hapsari (2025). Japanese Voice Conversation 0 [Dataset]. https://www.kaggle.com/datasets/rastyawidyahapsari/japanese-voice-conversation-0
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rastya Widya Hapsari
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Rastya Widya Hapsari

    Released under CC0: Public Domain

    Contents

  18. Japanese Call Center Data for Delivery & Logistics AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Call Center Data for Delivery & Logistics AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/delivery-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Japanese Call Center Speech Dataset for the Delivery and Logistics industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking customers. With over 40 hours of real-world, unscripted call center audio, this dataset captures authentic delivery-related conversations essential for training high-performance ASR models.

    Curated by FutureBeeAI, this dataset empowers AI teams, logistics tech providers, and NLP researchers to build accurate, production-ready models for customer support automation in delivery and logistics.

    Speech Data

    The dataset contains 40 hours of dual-channel call center recordings between native Japanese speakers. Captured across various delivery and logistics service scenarios, these conversations cover everything from order tracking to missed delivery resolutions, offering a rich, real-world training base for AI models.

    Participant Diversity:
    Speakers: 80 native Japanese speakers from our verified contributor pool.
    Regions: Multiple provinces of Japan for accent and dialect diversity.
    Participant Profile: Balanced gender distribution (60% male, 40% female) with ages ranging from 18 to 70.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted customer-agent dialogues.
    Call Duration: 5 to 15 minutes on average.
    Audio Format: Stereo WAV, 16-bit depth, recorded at 8kHz and 16kHz.
    Recording Environment: Captured in clean, noise-free, echo-free conditions.

    Topic Diversity

    This speech corpus includes both inbound and outbound delivery-related conversations, covering varied outcomes (positive, negative, neutral) to train adaptable voice models.

    Inbound Calls:
    Order Tracking
    Delivery Complaints
    Undeliverable Addresses
    Return Process Enquiries
    Delivery Method Selection
    Order Modifications, and more
    Outbound Calls:
    Delivery Confirmations
    Subscription Offer Calls
    Incorrect Address Follow-ups
    Missed Delivery Notifications
    Delivery Feedback Surveys
    Out-of-Stock Alerts, and others

    This comprehensive coverage reflects real-world logistics workflows, helping voice AI systems interpret context and intent with precision.

    Transcription

    All recordings come with high-quality, human-generated verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, noise)
    High transcription accuracy with word error rate under 5% via dual-layer quality checks.

    These transcriptions support fast, reliable model development for Japanese voice AI applications in the delivery sector.

    Metadata

    Detailed metadata is included for each participant and conversation:

    Participant Metadata: ID, age, gender, region, accent, dialect.
    Conversation Metadata: Topic, call type, sentiment, sample rate, and technical attributes.

    This metadata aids in training specialized models, filtering demographics, and running advanced analytics.

    Usage and Applications

    This

  19. Japanese Human-Human Chat Dataset for Conversational AI & NLP

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese Human-Human Chat Dataset for Conversational AI & NLP [Dataset]. https://www.futurebeeai.com/dataset/text-dataset/japanese-general-domain-conversation-text-dataset
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The Japanese General Domain Chat Dataset is a high-quality, text-based dataset designed to train and evaluate conversational AI, NLP models, and smart assistants in real-world Japanese usage. Collected through FutureBeeAI’s trusted crowd community, this dataset reflects natural, native-level Japanese conversations covering a broad spectrum of everyday topics.

    Conversational Text Data

    This dataset includes over 15,000 chat transcripts, each featuring free-flowing dialogue between two native Japanese speakers. The conversations are spontaneous, context-rich, and mimic informal, real-life texting behavior.

    Words per Chat: 300–700
    Turns per Chat: Up to 50 dialogue turns
    Contributors: 200 native Japanese speakers from the FutureBeeAI Crowd Community
    Format: TXT, DOCS, JSON or CSV (customizable)
    Structure: Each record contains the full chat, topic tag, and metadata block
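    As a rough illustration of the record structure described above (full chat, topic tag, and metadata block), the following Python sketch builds one hypothetical record and checks it against the stated turn and length ranges. All field names and values are invented for illustration; the delivered TXT/DOCS/JSON/CSV layout may differ.

# Hypothetical chat record illustrating the structure described above.
# Every key and value here is invented; consult the delivered format for the real schema.
record = {
    "topic": "Food and cooking",
    "metadata": {
        "participant_ages": [27, 31],
        "genders": ["female", "male"],
        "country_region": "Japan",
        "chat_domain": "General",
        "dialect": "Kansai",
    },
    "chat": [
        {"speaker": "A", "text": "昨日、新しいラーメン屋に行ってきたよ。"},
        {"speaker": "B", "text": "いいね！どこのお店？おいしかった？"},
        # ... further turns, up to the stated maximum of 50 ...
    ],
}

turns = len(record["chat"])
length = sum(len(turn["text"]) for turn in record["chat"])  # rough length; Japanese has no word spaces
print(f"turns={turns} (spec: up to 50), length={length} (spec: 300-700 words per chat)")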

    Diversity and Domain Coverage

    Conversations span a wide variety of general-domain topics to ensure comprehensive model exposure:

    Music, books, and movies
    Health and wellness
    Children and parenting
    Family life and relationships
    Food and cooking
    Education and studying
    Festivals and traditions
    Environment and daily life
    Internet and tech usage
    Childhood memories and casual chatting

    This diversity ensures the dataset is useful across multiple NLP and language understanding applications.

    Linguistic Authenticity

    Chats reflect informal, native-level Japanese usage with:

    Colloquial expressions and local dialect influence
    Domain-relevant terminology
    Language-specific grammar, phrasing, and sentence flow
    Inclusion of realistic details such as names, phone numbers, email addresses, locations, dates, times, local currencies, and culturally grounded references
    Representation of different writing styles and input quirks to ensure training data realism

    Metadata

    Every chat instance is accompanied by structured metadata, which includes:

    Participant Age
    Gender
    Country/Region
    Chat Domain
    Chat Topic
    Dialect

    This metadata supports model filtering, demographic-specific evaluation, and more controlled fine-tuning workflows.

    Data Quality Assurance

    All chat records pass through a rigorous QA process to maintain consistency and accuracy:

    Manual review for content completeness
    Format checks for chat turns and metadata
    Linguistic verification by native speakers
    Removal of inappropriate or unusable samples

    This ensures a clean, reliable dataset ready for high-performance AI model training.

    Applications

    This dataset is ideal for training and evaluating a wide range of text-based AI systems:

    Conversational AI / Chatbots
    Smart assistants and voicebots

  20. Sales of foreign language conversation schools Japan FY 2014-2023

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Sales of foreign language conversation schools Japan FY 2014-2023 [Dataset]. https://www.statista.com/statistics/1199587/japan-sales-foreign-language-conversation-schools/
    Explore at:
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Japan
    Description

    In the fiscal year 2023, the total sales of foreign language conversation schools in Japan amounted to around ***** billion Japanese yen. That year, there were about *** thousand business establishments of such schools in the country.
