https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Japanese General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of Japanese speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Japanese communication.
Curated by FutureBeeAI, this 40-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Japanese speech models that understand and respond to authentic Japanese accents and dialects.
The dataset comprises 40 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Japanese. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
The dataset comes with granular metadata for both speakers and recordings:
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
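To illustrate how such JSON transcriptions and speaker metadata might be consumed in an ASR pipeline, here is a minimal Python sketch. The file layout and field names (`audio`, `segments`, `speaker_id`, `gender`, `age_group`, `dialect`) are assumptions for illustration, not the dataset's documented schema.

```python
import json
from pathlib import Path

# Hypothetical layout: one JSON transcription per audio file, plus a speaker metadata file.
# All field names below are assumptions; adapt them to the schema shipped with the corpus.
DATA_DIR = Path("japanese_general_conversation")

def load_transcript(json_path: Path) -> list[dict]:
    """Return the utterance segments of one conversation."""
    with json_path.open(encoding="utf-8") as f:
        doc = json.load(f)
    return doc.get("segments", [])  # assumed: [{"speaker_id", "start", "end", "text"}, ...]

def filter_speakers(metadata_path: Path, dialect: str) -> set[str]:
    """Collect speaker IDs matching a dialect label from the speaker metadata."""
    with metadata_path.open(encoding="utf-8") as f:
        speakers = json.load(f)  # assumed: [{"speaker_id", "gender", "age_group", "dialect"}, ...]
    return {s["speaker_id"] for s in speakers if s.get("dialect") == dialect}

if __name__ == "__main__":
    wanted = filter_speakers(DATA_DIR / "speakers.json", dialect="kansai")
    for json_file in sorted(DATA_DIR.glob("transcripts/*.json")):
        segments = load_transcript(json_file)
        kept = [seg for seg in segments if seg.get("speaker_id") in wanted]
        print(json_file.name, f"{len(kept)}/{len(segments)} segments from selected speakers")
```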
This dataset is a versatile resource for multiple Japanese speech and language AI applications:
Japanese (Japan) Spontaneous Dialogue Smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamps, speaker ID, gender, and other attributes. The dataset was collected from an extensive and diverse pool of around 1,000 native speakers across geographic regions, enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; our datasets are GDPR, CCPA, and PIPL compliant.
Dataset Card for japanese photos conversation
Dataset details
This dataset contains multi-turn conversational instructions about images taken in Japan. The images were sourced from https://huggingface.co/datasets/ThePioneer/japanese-photos. We input each image into GPT-4o (gpt-4o-2024-05-13) via the Azure OpenAI API to generate the instruction data. Some of the images in the original dataset were filtered by the Azure OpenAI API when they were input, resulting in a total… See the full description on the dataset page: https://huggingface.co/datasets/llm-jp/japanese-photos-conversation.
We conducted labeling work on a spoken Japanese dataset (I-JAS) for text classification. The corpus contains 50 interview dialogues of two-way Japanese conversation that discuss the participants' past, present, and future; each dialogue is 30 minutes long. From this dataset, we selected the interview dialogues of native Japanese speakers as samples and annotated sentences with 13 labels. The labeling work was conducted by native Japanese speakers with data annotation experience.
Labels:
- ssi/osi: subjective information
- soi/ooi: objective information
- op/sp: plan
- qu: question
- ap: apology
- th: thanking
- cc: topic changing/closing
- ag: agreement
- ds: disagreement
- re: request
- pr: proposal
- su: summarize/reformulate
- th: other
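As an illustration of the text-classification task these labels support, the sketch below trains a simple dialogue-act classifier. The sentence/label pairs are placeholders, and the character n-gram TF-IDF plus logistic regression setup is just one reasonable baseline for Japanese text, not the approach used in the original annotation study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder (sentence, label) pairs in the spirit of the annotation scheme above.
# A real experiment would load the annotated dialogue sentences and their 13 labels instead.
train_sentences = [
    "すみません、もう一度お願いします。",  # apology
    "ありがとうございます。",              # thanking
    "週末は京都に行く予定です。",          # plan
    "それはどこで買いましたか。",          # question
]
train_labels = ["ap", "th", "op/sp", "qu"]

# Character n-grams avoid the need for a Japanese word tokenizer in this quick baseline.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_sentences, train_labels)

print(clf.predict(["ご迷惑をおかけして申し訳ありません。"]))  # expected: a label such as 'ap'
```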
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Japanese Telephone Dialogues Dataset - 513 Hours
Dataset comprises 513 hours of high-quality telephone audio recordings in Japanese, featuring 800+ native speakers and achieving a 95% sentence accuracy rate. Designed for advancing speech recognition models and language processing, this extensive speech data corpus covers diverse topics and domains, making it ideal for training robust automatic speech recognition (ASR) systems.
Dataset characteristics:… See the full description on the dataset page: https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset.
Japanese (Japan) Children Real-world Casual Conversation and Monologue speech dataset, covering self-media, conversation, live, lecture, variety show, and other generic domains, mirroring real-world interactions. Transcribed with text content, speaker ID, gender, age, accent, and other attributes. The dataset was collected from an extensive and diverse pool of speakers (children 12 years old and younger) across geographic regions, enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; our datasets are GDPR, CCPA, and PIPL compliant.
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Japanese Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking telecom customers. Featuring over 40 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.
The dataset contains 40 hours of dual-channel call center recordings between native Japanese speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.
This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.
This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.
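The dual-channel layout makes it straightforward to separate agent and customer audio and cut it into the time-coded segments from the JSON transcription. The sketch below uses the soundfile library; the channel assignment, file names, and JSON field names (`segments`, `start`, `end`, `channel`, `text`) are assumptions for illustration, not the dataset's documented format.

```python
import json
from pathlib import Path

import soundfile as sf

# Assumed convention: channel 0 = customer, channel 1 = agent; adjust to the actual channel map.
audio, sr = sf.read("call_0001.wav", always_2d=True)
customer, agent = audio[:, 0], audio[:, 1]

with open("call_0001.json", encoding="utf-8") as f:
    transcript = json.load(f)

Path("segments").mkdir(exist_ok=True)

# Assumed segment schema: {"start": seconds, "end": seconds, "channel": 0 or 1, "text": "..."}
for i, seg in enumerate(transcript.get("segments", [])):
    channel = customer if seg["channel"] == 0 else agent
    clip = channel[int(seg["start"] * sr): int(seg["end"] * sr)]
    sf.write(f"segments/call_0001_{i:04d}.wav", clip, sr)
    print(seg["channel"], seg["text"])
```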
Rich metadata is available for each participant and conversation:
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Japanese Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking real estate customers. With over 40 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, ideal for building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
The dataset features 40 hours of dual-channel call center recordings between native Japanese speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
Such domain-rich variety ensures model generalization across common real estate support conversations.
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
These transcriptions streamline ASR and NLP development for Japanese real estate voice applications.
Detailed metadata accompanies each participant and conversation:
This enables smart filtering, dialect-focused model training, and structured dataset exploration.
This dataset is ideal for voice AI and NLP systems built for the real estate sector:
https://data.macgence.com/terms-and-conditions
Enhance customer service with Macgence's Japanese call center dataset. Perfect for AI and analytics, delivering precise and actionable insights for innovations!
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Japanese Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Japanese speech recognition, spoken language understanding, and conversational AI systems. With 40 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
The dataset features 40 hours of dual-channel call center conversations between native Japanese speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
This dataset can be used across a range of healthcare and voice AI use cases:
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Japanese Call Center Speech Dataset for the Travel industry is purpose-built to power the next generation of voice AI applications for travel booking, customer support, and itinerary assistance. With over 40 hours of unscripted, real-world conversations, the dataset enables the development of highly accurate speech recognition and natural language understanding models tailored for Japanese-speaking travelers.
Created by FutureBeeAI, this dataset supports researchers, data scientists, and conversational AI teams in building voice technologies for airlines, travel portals, and hospitality platforms.
The dataset includes 40 hours of dual-channel audio recordings between native Japanese speakers engaged in real travel-related customer service conversations. These audio files reflect a wide variety of topics, accents, and scenarios found across the travel and tourism industry.
Inbound and outbound conversations span a wide range of real-world travel support situations with varied outcomes (positive, neutral, negative).
These scenarios help models understand and respond to diverse traveler needs in real-time.
Each call is accompanied by manually curated, high-accuracy transcriptions in JSON format.
Extensive metadata enriches each call and speaker for better filtering and AI training:
This dataset is ideal for a variety of AI use cases in the travel and tourism space:
https://choosealicense.com/licenses/other/
J-CHAT is a Japanese large-scale dialogue speech corpus. For a detailed explanation, please see our paper.
PLEASE READ THIS FIRST
[!IMPORTANT] TO USE THIS DATASET, YOU MUST AGREE THAT YOU WILL USE THE DATASET SOLELY FOR THE PURPOSE OF JAPANESE COPYRIGHT ACT ARTICLE 30-4.
What's new?
[!NOTE] Added transcriptions of the corpus. Transcriptions are based on reazonspeech-nemo-v2.
How can I use this data for commercial purposes?
Commercial use is not permitted. If you… See the full description on the dataset page: https://huggingface.co/datasets/sarulab-speech/J-CHAT.
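For reference, a corpus hosted on the Hugging Face Hub is typically loaded with the `datasets` library as sketched below. Whether J-CHAT requires accepting its usage terms or authenticating first depends on how the repository is gated, so treat the exact call as an assumption and check the dataset page before use.

```python
from datasets import load_dataset

# Generic loading pattern for Hub-hosted corpora; the split name is an assumption.
# Agree to the stated usage terms (Japanese Copyright Act Article 30-4) on the dataset page first.
ds = load_dataset("sarulab-speech/J-CHAT", split="train", streaming=True)

for i, example in enumerate(ds):
    print(example.keys())  # inspect the available fields (audio, transcription, etc.)
    if i >= 2:
        break
```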
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Card for Japanese audio call center human to machine conversations
This dataset contains synthetic audio conversations in Japanese between human customers and machine agents, simulating real-world call center scenarios.
Dataset Details
Dataset Description
Curated by: AIxBlock (aixblock.io) Funded by [optional]: AIxBlock (aixblock.io) Shared by [optional]: AIxBlock (aixblock.io) Language(s) (NLP): Japanese License: Creative Commons Attribution Non… See the full description on the dataset page: https://huggingface.co/datasets/AIxBlock/Human-to-machine-Japanese-audio-call-center-conversations.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Explore our insightful case study on Phone Conversations in Japanese, detailing the nuances and strategies for effective communication.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Japanese Speech Dataset for recognition task
Dataset comprises 513 hours of telephone dialogues in Japanese, collected from 878 native speakers across various topics and domains, with an impressive 98% Word Accuracy Rate. It is designed for research in speech recognition, focusing on various recognition models, primarily aimed at meeting the requirements for automatic speech recognition (ASR) systems. By utilizing this dataset, researchers and developers can advance their… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/japanese-speech-recognition-dataset.
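Accuracy figures like the quoted 98% Word Accuracy Rate are typically reported as 1 − WER. A minimal sketch of scoring ASR output against reference transcripts with the jiwer library is shown below; for Japanese, character error rate (CER) is often the more informative metric because word boundaries are not marked in the text.

```python
import jiwer

# Toy reference/hypothesis pairs; a real evaluation would iterate over the corpus transcripts
# and the ASR system's outputs. WER on Japanese requires a consistent word segmentation,
# so CER is used here.
references = ["今日は天気がいいですね", "電話番号を教えてください"]
hypotheses = ["今日は天気がいいですね", "電話番号を教えて下さい"]

cer = jiwer.cer(references, hypotheses)
print(f"CER: {cer:.3f}, character accuracy: {1 - cer:.3f}")
```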
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Japanese-Roleplay
This is a dialogue corpus collected from a Japanese role-playing forum (commonly known as "なりきりチャット" (narikiri chat)). Each record corresponds to a single thread. The following filtering and cleaning conditions have been applied:
- For all post_content in the posts of each record, remove response anchors.
- For all post_content in the posts of each record, delete posts where the post_content length is 10 characters or less.
- If the number of unique poster types in the… See the full description on the dataset page: https://huggingface.co/datasets/OmniAICreator/Japanese-Roleplay.
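A rough Python sketch of the kind of cleaning described above is shown here. The response-anchor pattern (`>>123`-style references) and the record structure are assumptions based on common Japanese forum conventions, not the exact rules used to build the corpus.

```python
import re

# Assumed thread structure: {"posts": [{"poster": ..., "post_content": ...}, ...]}
ANCHOR_RE = re.compile(r">>\d+")  # assumed anchor format, e.g. ">>12"

def clean_thread(record: dict) -> dict:
    cleaned = []
    for post in record["posts"]:
        text = ANCHOR_RE.sub("", post["post_content"]).strip()
        if len(text) <= 10:  # drop very short posts, mirroring the stated threshold
            continue
        cleaned.append({**post, "post_content": text})
    return {**record, "posts": cleaned}

example = {"posts": [
    {"poster": "A", "post_content": ">>3 了解です"},
    {"poster": "B", "post_content": "静かな夜の城下町を、ゆっくりと歩き始めた。"},
]}
print(clean_thread(example))  # the short anchored reply is removed, the narrative post is kept
```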
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Rastya Widya Hapsari
Released under CC0: Public Domain
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Japanese Call Center Speech Dataset for the Delivery and Logistics industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Japanese-speaking customers. With over 40 hours of real-world, unscripted call center audio, this dataset captures authentic delivery-related conversations essential for training high-performance ASR models.
Curated by FutureBeeAI, this dataset empowers AI teams, logistics tech providers, and NLP researchers to build accurate, production-ready models for customer support automation in delivery and logistics.
The dataset contains 40 hours of dual-channel call center recordings between native Japanese speakers. Captured across various delivery and logistics service scenarios, these conversations cover everything from order tracking to missed-delivery resolutions, offering a rich, real-world training base for AI models.
This speech corpus includes both inbound and outbound delivery-related conversations, covering varied outcomes (positive, negative, neutral) to train adaptable voice models.
This comprehensive coverage reflects real-world logistics workflows, helping voice AI systems interpret context and intent with precision.
All recordings come with high-quality, human-generated verbatim transcriptions in JSON format.
These transcriptions support fast, reliable model development for Japanese voice AI applications in the delivery sector.
Detailed metadata is included for each participant and conversation:
This metadata aids in training specialized models, filtering demographics, and running advanced analytics.
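As a sketch of the kind of filtering and analytics this metadata enables, the snippet below assumes the per-call metadata has been flattened into a CSV with columns such as `call_id`, `direction`, `outcome`, and `duration_sec`; those column names are illustrative, not the dataset's documented fields.

```python
import pandas as pd

# Illustrative column names; map them to the metadata fields actually shipped with the corpus.
meta = pd.read_csv("call_metadata.csv")

# Distribution of call outcomes by direction (inbound vs. outbound).
print(meta.groupby(["direction", "outcome"]).size().unstack(fill_value=0))

# Average call duration per outcome, useful for spotting long-running issue types.
print(meta.groupby("outcome")["duration_sec"].mean().round(1))

# Select a slice for targeted fine-tuning or evaluation.
subset = meta[(meta["direction"] == "inbound") & (meta["outcome"] == "negative")]
print(f"{len(subset)} inbound calls with negative outcomes")
```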
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Japanese General Domain Chat Dataset is a high-quality, text-based dataset designed to train and evaluate conversational AI, NLP models, and smart assistants in real-world Japanese usage. Collected through FutureBeeAI’s trusted crowd community, this dataset reflects natural, native-level Japanese conversations covering a broad spectrum of everyday topics.
This dataset includes over 15,000 chat transcripts, each featuring free-flowing dialogue between two native Japanese speakers. The conversations are spontaneous, context-rich, and mimic informal, real-life texting behavior.
Conversations span a wide variety of general-domain topics to ensure comprehensive model exposure:
This diversity ensures the dataset is useful across multiple NLP and language understanding applications.
Chats reflect informal, native-level Japanese usage with:
Every chat instance is accompanied by structured metadata, which includes:
This metadata supports model filtering, demographic-specific evaluation, and more controlled fine-tuning workflows.
All chat records pass through a rigorous QA process to maintain consistency and accuracy:
This ensures a clean, reliable dataset ready for high-performance AI model training.
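To show how such chat transcripts might feed a text-model fine-tuning pipeline, here is a minimal conversion sketch to a messages-style JSONL format. The input field names (`turns`, `speaker`, `text`) are assumptions about the delivery format, not FutureBeeAI's documented schema.

```python
import json
from pathlib import Path

# Assumed input: one JSON file per chat with {"turns": [{"speaker": "A"/"B", "text": ...}, ...]}.
# Output: JSONL where each line holds a "messages" list, a common fine-tuning format.
def chat_to_messages(chat: dict) -> list[dict]:
    role_map = {"A": "user", "B": "assistant"}  # arbitrary mapping for illustration
    return [
        {"role": role_map.get(turn["speaker"], "user"), "content": turn["text"]}
        for turn in chat["turns"]
    ]

with open("chat_finetune.jsonl", "w", encoding="utf-8") as out:
    for path in sorted(Path("chats").glob("*.json")):
        with path.open(encoding="utf-8") as f:
            chat = json.load(f)
        out.write(json.dumps({"messages": chat_to_messages(chat)}, ensure_ascii=False) + "\n")
```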
This dataset is ideal for training and evaluating a wide range of text-based AI systems:
In the fiscal year 2023, the total sales of foreign language conversation schools in Japan amounted to around ***** billion Japanese yen. That year, there were about *** thousand business establishments of such schools in the country.