100+ datasets found
  1. Japanese Dataset

    • hmn.shaip.com
    • shaip.com
    Updated Aug 17, 2024
    + more versions
    Cite
    Shaip (2024). Japanese Dataset [Dataset]. https://hmn.shaip.com/offerings/speech-data-catalog/japanese-dataset/
    Explore at:
    Dataset updated
    Aug 17, 2024
    Dataset authored and provided by
    Shaip
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    High-Quality Japanese TTS Dataset for AI & Speech Models
    Title: Japanese Language Dataset
    Dataset Type: TTS
    Description: Single-utterance recordings, which tend to fall in the 5 to 30 second range.
    Use Case: ASR,…

  2. Telecom Call Center Speech Data: Japanese (Japan)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Telecom Call Center Speech Data: Japanese (Japan) [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Japanese Call Center Speech Dataset for the Telecom domain designed to enhance the development of call center speech recognition models specifically for the Telecom industry. This dataset is meticulously curated to support advanced speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

    Speech Data

    This training dataset comprises 40 Hours of call center audio recordings covering various topics and scenarios related to the Telecom domain, designed to build robust and accurate customer service speech technology.

    Participant Diversity:
    Speakers: 80 expert native Japanese speakers from the FutureBeeAI Community.
    Regions: Different states/provinces of Japan, ensuring a balanced representation of Japanese accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Details:
    Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
    Call Duration: Average duration of 5 to 15 minutes per call.
    Formats: WAV format with stereo channels, a bit depth of 16 bits, and sample rates of 8 kHz and 16 kHz.
    Environment: Without background noise and without echo.
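
    The recording details above can be checked on a delivered file with a short script. This is a minimal sketch using Python's standard `wave` module; the helper names (`describe_wav`, `matches_listing`, `make_silent_wav`) are hypothetical, and the synthetic silent file stands in for a real recording, which is not available here.

```python
import io
import wave


def describe_wav(source):
    """Return channel count, bit depth, sample rate, and duration of a WAV file.

    `source` may be a path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate": sample_rate,
            "duration_s": wav.getnframes() / sample_rate,
        }


def matches_listing(info):
    """Check the properties stated in the listing: stereo, 16-bit, 8 or 16 kHz."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and info["sample_rate"] in (8000, 16000)
    )


def make_silent_wav(sample_rate=8000, seconds=1):
    """Build an in-memory stereo 16-bit WAV of silence (stand-in for a real call)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes = 16 bits
        wav.setframerate(sample_rate)
        # frames * channels * bytes per sample
        wav.writeframes(b"\x00" * (sample_rate * seconds * 2 * 2))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    info = describe_wav(make_silent_wav(sample_rate=16000, seconds=1))
    print(info, matches_listing(info))
```

    For a real delivery, `describe_wav` would be called with the path of each WAV file instead of the synthetic buffer.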

    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refunds and Billing Adjustments
    Emergency Service Access, and many more
    Outbound Calls:
    Welcome Calls / Onboarding Process
    Payment Reminders
    Customer Surveys
    Technical Updates
    Service Usage Reviews
    Network Complaint Status Call, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

    Speaker-wise Segmentation: Time-coded segments for both agents and customers.
    Non-Speech Labels: Tags and labels for non-speech elements.
    Word Error Rate: Word error rate is less than 5% thanks to the dual layer of QA.
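
    The listing does not say how its word error rate is measured. As an illustration only, the usual definition is the edit distance between a reference and a hypothesis token sequence divided by the reference length. The sketch below assumes that definition; the function names are hypothetical, and because Japanese text is not space-delimited, the tokens could equally be characters (giving a character error rate) or the output of a tokenizer of your choice.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two token sequences (row-by-row DP)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        current = [i]
        for j, hyp_token in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_token == hyp_token else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


def error_rate(reference, hypothesis):
    """Edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Word-level example on space-delimited text.
    print(error_rate("a b c d".split(), "a x c".split()))
    # Character-level example, one option for Japanese text.
    print(error_rate(list("こんにちは"), list("こんにちわ")))
```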

    These ready-to-use transcriptions accelerate the development of the Telecom domain call center conversational AI and ASR models for the Japanese language.

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.

  3. Japanese (Japan) General Conversation Speech Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Japanese (Japan) General Conversation Speech Dataset [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-japanese-japan
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Japan
    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Welcome to the Japanese Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of Japanese language speech recognition models, with a particular focus on Japan accents and dialects.

    With high-quality audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and Generative Voice AI algorithms. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the Japanese language spoken in Japan.

    Speech Data:

    This training dataset comprises 50 hours of audio recordings covering a wide range of topics and scenarios, ensuring robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 70 native Japanese speakers from different states/provinces of Japan. This collaborative effort guarantees a balanced representation of Japan accents, dialects, and demographics, reducing biases and promoting inclusivity.

    Each audio recording captures the essence of spontaneous, unscripted conversations between two individuals, with an average duration ranging from 15 to 60 minutes. The speech data is available in WAV format, with stereo channel files having a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise and echo.

    Metadata:

    In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This metadata includes the participant's age, gender, country, state, and dialect. Furthermore, additional metadata such as recording device detail, topic of recording, bit depth, and sample rate will be provided.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Japanese language speech recognition models.

    Transcription:

    This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format. The transcriptions capture speaker-wise transcription with time-coded segmentation along with non-speech labels and tags.

    Our goal is to expedite the deployment of Japanese language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.

    Updates and Customization:

    We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.

    If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8kHz to 48kHz, allowing you to fine-tune your models for different audio recording setups. Additionally, we can also customize the transcription following your specific guidelines and requirements, to further support your ASR development process.

    License:

    This audio dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.

  4. Japanese Oceanographic Data Center Japan Land Gravity

    • catalog.data.gov
    • ncei.noaa.gov
    Updated Oct 18, 2024
    Cite
    NOAA National Centers for Environmental Information (Point of Contact) (2024). Japanese Oceanographic Data Center Japan Land Gravity [Dataset]. https://catalog.data.gov/dataset/japanese-oceanographic-data-center-japan-land-gravity2
    Explore at:
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    National Centers for Environmental Information (https://www.ncei.noaa.gov/)
    Area covered
    Japan
    Description

    The gravity station data (4,381 records) were compiled by the Japanese Oceanographic Data Center. This data base was received in July 1988. The data are in the 'MGD77' exchange format. Principal gravity parameters include Free-air Anomalies and Observed gravity corrected for Eotvos, drift, and tares. The observed gravity values are referenced to the International Gravity Standardization Net 1971 (IGSN 71). The gravity anomaly computation uses the Geodetic Reference System 1967 (GRS 67) theoretical gravity formula. The data are randomly distributed within the boundaries of Japan.

  5. English-Japanese translated Parallel Corpora for Environment Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). English-Japanese translated Parallel Corpora for Environment Domain [Dataset]. https://www.futurebeeai.com/dataset/parallel-corpora/japanese-english-translated-parallel-corpus-for-environment-domain
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the English-Japanese Bilingual Parallel Corpora dataset for the Environment domain! This comprehensive dataset contains a vast collection of bilingual text data, carefully translated between English and Japanese, to support the development of environment-specific language models and machine translation engines.

    Dataset Content

    Volume and Diversity:
    Extensive Dataset: Over 50,000 sentences offering a robust dataset for various applications.
    Translator Diversity: Contributions from more than 200 native translators ensure a wide range of linguistic styles and interpretations.
    Sentence Diversity:
    Word Count: Sentences range from 7 to 25 words, suitable for various computational linguistic applications.
    Syntactic Variety: The corpus encompasses sentences with varying syntactic structures, including simple, compound, and complex sentences.
    Interrogative and Imperative Forms: The corpus includes sentences in interrogative (question) and imperative (command) forms, reflecting the conversational nature of the environmental domain.
    Affirmative and Negative Statements: Both affirmative and negative statements are represented in the corpus, ensuring different polarities.
    Passive and Active Voice: The corpus features sentences written in both active and passive voice, ensuring different perspectives and representations of information.
    Idiomatic Expressions and Figurative Language: The corpus incorporates idiomatic expressions, metaphors, and figurative language commonly used in the Environment domain.
    Discourse Markers and Connectives: The corpus includes a wide range of discourse markers and connectives, such as conjunctions, transitional phrases, and logical connectors, which are crucial for capturing the logical flow and coherence of the text.
    Cross Translation: The dataset includes a cross-translation, where a part of the dataset is translated from English to Japanese and another portion is translated from Japanese to English, to improve bi-directional translation capabilities.

    Domain Specific Content

    This Parallel Corpus is meticulously curated to capture the linguistic intricacies and domain-specific nuances inherent to the Environment domain.

    Industry-Tailored Terminology: The corpus encompasses a comprehensive lexicon of Environment-specific terminology, ranging from terms related to ecology, conservation, and sustainability to concepts and theories from various environmental fields.
    Authentic Industry Expressions: Beyond technical terminology, the corpus captures the authentic expressions, idioms, and colloquialisms used within the environmental domain, including discussions on climate change, biodiversity, and environmental policy.
    Contexts Specific to Environment Domain: The corpus encompasses a diverse range of contexts specific to the Environment domain, including environmental impact assessments, scientific articles, sustainability reports, and ecological research papers.
    Cross-Domain Applicability: While the primary focus is on the environmental domain, the corpus also includes relevant cross-domain content from related areas, such as geography, urban planning, public health, and renewable energy.

    Format and Structure

    Multiple Formats: Available in Excel format, with the ability to convert to JSON, TMX, XML, XLIFF, XLS, and other industry-standard formats, facilitating ease of use and integration.
    Structure: It contains information like Serial Number, Unique ID, Source Sentence, Source Sentence Word Count, Target Sentence, and Target Sentence Word Count.
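
    As a rough illustration of how the listed columns could be used, the sketch below flags rows whose stored word counts fall outside the 7 to 25 word range stated under Sentence Diversity. It assumes the Excel file has been exported to CSV with exactly the column names given above; the sample rows, IDs, and counts are hypothetical and not taken from the dataset.

```python
import csv
import io

# Hypothetical sample rows; column names follow the listing's "Structure" line.
SAMPLE_CSV = """Serial Number,Unique ID,Source Sentence,Source Sentence Word Count,Target Sentence,Target Sentence Word Count
1,ENV-0001,Recycling reduces the amount of waste sent to landfills every year.,11,リサイクルは毎年埋立地に送られる廃棄物の量を減らします。,12
2,ENV-0002,Plant more trees.,3,もっと木を植えましょう。,4
"""

# Range stated in the listing: sentences of 7 to 25 words.
MIN_WORDS, MAX_WORDS = 7, 25


def out_of_range_rows(csv_text):
    """Return the Unique IDs of rows whose stored word counts are outside the range."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts = (
            int(row["Source Sentence Word Count"]),
            int(row["Target Sentence Word Count"]),
        )
        if any(not MIN_WORDS <= count <= MAX_WORDS for count in counts):
            flagged.append(row["Unique ID"])
    return flagged


if __name__ == "__main__":
    print(out_of_range_rows(SAMPLE_CSV))
```

    The check relies on the stored counts rather than recounting words, since the listing does not say how words are counted for Japanese sentences.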

    Usage and Application

    Machine Translation: Develop accurate machine translation engines for environmental content.
    NLP Applications: Improve predictive keyboards, spell checkers, grammar checkers, and text/speech understanding systems tailored for environmental

  6. Japanese Open Ended Classification Prompt & Response Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Cite
    FutureBee AI (2022). Japanese Open Ended Classification Prompt & Response Dataset [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/japanese-open-ended-classification-text-dataset
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Welcome to the Japanese Open Ended Classification Prompt-Response Dataset—an extensive collection of 3000 meticulously curated prompt and response pairs. This dataset is a valuable resource for training Language Models (LMs) to classify input text accurately, a crucial aspect in advancing generative AI.

    Dataset Content:

    This open-ended classification dataset comprises a diverse set of prompts and responses where the prompt contains input text to be classified and may also contain task instruction, context, constraints, and restrictions while completion contains the best classification category as response. Both these prompts and completions are available in Japanese language. As this is an open-ended dataset, there will be no options given to choose the right classification category as a part of the prompt.

    These prompt and completion pairs cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more. Each prompt is accompanied by a response, providing valuable information and insights to enhance the language model training process. Both the prompt and response were manually curated by native Japanese people, and references were taken from diverse sources like books, news articles, websites, and other reliable references.

    This open-ended classification prompt and completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains prompts and responses with different types of rich text, including tables, code, JSON, etc., with proper markdown.

    Prompt Diversity:

    To ensure diversity, this open-ended classification dataset includes prompts with varying complexity levels, ranging from easy to medium and hard. Additionally, prompts are diverse in terms of length from short to medium and long, creating a comprehensive variety. The classification dataset also contains prompts with constraints and persona restrictions, which makes it even more useful for LLM training.

    Response Formats:

    To accommodate diverse learning experiences, our dataset incorporates different types of responses depending on the prompt. These formats include single-word, short phrase, and single sentence type of response. These responses encompass text strings, numerical values, and date and time formats, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.

    Data Format and Annotation Details:

    This fully labeled Japanese Open Ended Classification Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt length, prompt complexity, domain, response, response type, and rich text presence.

    Quality and Accuracy:

    Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.

    The Japanese version is grammatically accurate without any spelling or grammatical errors. No copyrighted, toxic, or harmful content is used during the construction of this dataset.

    Continuous Updates and Customization:

    The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom open-ended classification prompt and completion data tailored to specific needs, providing flexibility and customization options.

    License:

    The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can leverage this fully labeled and ready-to-deploy Japanese Open Ended Classification Prompt-Completion Dataset to enhance the classification abilities and accurate response generation capabilities of their generative AI models and explore new approaches to NLP tasks.

  7. ja_asr.common_voice_8_0

    • huggingface.co
    Updated Sep 27, 2024
    Cite
    Japanese ASR (2024). ja_asr.common_voice_8_0 [Dataset]. https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 27, 2024
    Dataset authored and provided by
    Japanese ASR
    Description

    japanese-asr/ja_asr.common_voice_8_0 dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. Data from: Japanese Elder’s Language Index Corpus v2

    • figshare.com
    zip
    Updated Feb 11, 2016
    Cite
    Eiji Aramaki (2016). Japanese Elder’s Language Index Corpus v2 [Dataset]. http://doi.org/10.6084/m9.figshare.2082706.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 11, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Eiji Aramaki
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    A corpus database managed by the MedNLP Laboratory, Kyoto University, Japan. This corpus was compiled using data from 22 people aged 74 to 86 years (mean age: 78.32 years; standard deviation [SD]: 3.36) who agreed to provide data for research purposes. The corpus also includes data from participants under 74 years of age (30 data items in total).

  9. Japanese-Novels-23M

    • huggingface.co
    Updated May 28, 2025
    Cite
    OmniAICreator (2025). Japanese-Novels-23M [Dataset]. https://huggingface.co/datasets/OmniAICreator/Japanese-Novels-23M
    Explore at:
    Dataset updated
    May 28, 2025
    Authors
    OmniAICreator
    Area covered
    Japan
    Description

    Japanese-Novels-23M

    This dataset contains Japanese web novels that I collected personally.

    Machine-Learning Use Only: Access is restricted to bona fide machine-learning–related purposes. To request access, please provide a detailed explanation of the specific tasks or applications for which you intend to use the dataset.

    Total records: 23,212,809 Total characters: 80,846,120,027 Total tokens (Llama 4 tokenizer): 55,406,468,406 (55.4 B)
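
    The three totals above imply some per-record averages, which can be worked out directly. The sketch below only divides the published totals (the function and key names are hypothetical); it suggests a few thousand characters per record and about 1.46 characters per token with the stated tokenizer.

```python
# Totals copied from the listing above.
TOTAL_RECORDS = 23_212_809
TOTAL_CHARACTERS = 80_846_120_027
TOTAL_TOKENS = 55_406_468_406


def corpus_averages(records, characters, tokens):
    """Derive simple averages from corpus-level totals."""
    return {
        "chars_per_record": characters / records,
        "tokens_per_record": tokens / records,
        "chars_per_token": characters / tokens,
    }


averages = corpus_averages(TOTAL_RECORDS, TOTAL_CHARACTERS, TOTAL_TOKENS)

if __name__ == "__main__":
    for name, value in averages.items():
        print(f"{name}: {value:,.2f}")
```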

  10. Speech corpus datasets in Japanese

    • data.macgence.com
    mp3
    Updated Mar 30, 2024
    Cite
    Macgence (2024). Speech corpus datasets in Japanese [Dataset]. https://data.macgence.com/dataset/speech-corpus-datasets-in-japanese
    Explore at:
    Available download formats: mp3
    Dataset updated
    Mar 30, 2024
    Dataset authored and provided by
    Macgence
    License

    https://data.macgence.com/terms-and-conditions

    Time period covered
    2025
    Area covered
    Worldwide
    Variables measured
    Outcome, Call Type, Transcriptions, Audio Recordings, Speaker Metadata, Conversation Topics
    Description

    The audio dataset includes speech corpora featuring Japanese speakers from Japan, with detailed metadata.

  11. Number of Japanese residents in Hong Kong 2015-2024

    • statista.com
    Updated Jun 2, 2025
    Cite
    Statista (2025). Number of Japanese residents in Hong Kong 2015-2024 [Dataset]. https://www.statista.com/statistics/1084420/japan-number-japanese-residents-hong-kong/
    Explore at:
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Hong Kong, Japan
    Description

    As of October 2024, approximately ****** Japanese residents lived in Hong Kong. Over the past decade, the Japanese population in the city has shown a general downward trend from around ****** residents ten years earlier.

  12. 101,702 Japanese Pronunciation Dictionary

    • m.nexdata.ai
    • nexdata.ai
    Updated Jan 21, 2024
    Cite
    Nexdata (2024). 101,702 Japanese Pronunciation Dictionary [Dataset]. https://m.nexdata.ai/datasets/pronunciation/1088
    Explore at:
    Dataset updated
    Jan 21, 2024
    Dataset provided by
    Nexdata
    nexdata technology inc
    Authors
    Nexdata
    Variables measured
    Format, Language, Data content, Application scenario
    Description

    The data contains 101,702 entries. All words and pronunciations are produced by Japanese linguists. It can be used in the research and development of Japanese ASR technology.

  13. 633 Hours - Japanese Conversational Speech by Mobile Phone

    • m.nexdata.ai
    Updated Feb 11, 2024
    + more versions
    Cite
    Nexdata (2024). 633 Hours - Japanese Conversational Speech by Mobile Phone [Dataset]. https://m.nexdata.ai/datasets/speechrecog/1166?source=Github
    Explore at:
    Dataset updated
    Feb 11, 2024
    Dataset provided by
    Nexdata
    nexdata technology inc
    Authors
    Nexdata
    Variables measured
    Format, Country, Speaker, Language, Annotation, Accuracy rate, Recording device, Recording Content, Language(Region) Code, Recording Environment
    Description

    Japanese (Japan) spontaneous dialogue smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamp, speaker's ID, gender and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (around 1,000 native speakers), which enhances model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes. Our datasets are all GDPR, CCPA, and PIPL compliant.

  14. Japanese names in Kanji

    • kaggle.com
    Updated Apr 5, 2025
    Cite
    Keith Lê (2025). Japanese names in Kanji [Dataset]. https://www.kaggle.com/datasets/keith1909/japanese-names-in-kanji
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 5, 2025
    Dataset provided by
    Kaggle
    Authors
    Keith Lê
    License

    https://cdla.io/sharing-1-0/

    Description

    Dataset

    This dataset was created by Keith Lê

    Released under Community Data License Agreement - Sharing - Version 1.0

    Contents

  15. Japan Vital Statistics: Japanese Only: Natural Increase

    • ceicdata.com
    Updated Mar 15, 2018
    Cite
    CEICdata.com (2018). Japan Vital Statistics: Japanese Only: Natural Increase [Dataset]. https://www.ceicdata.com/en/japan/vital-statistics/vital-statistics-japanese-only-natural-increase
    Explore at:
    Dataset updated
    Mar 15, 2018
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 1, 2006 - Dec 1, 2017
    Area covered
    Japan
    Variables measured
    Vital Statistics
    Description

    Vital Statistics: Japanese Only: Natural Increase data was reported at -394,373.000 Person in 2017. This records a decrease from the previous number of -330,770.000 Person for 2016. Vital Statistics: Japanese Only: Natural Increase data is updated yearly, averaging 768,649.000 Person from Dec 1947 (Median) to 2017, with 71 observations. The data reached an all-time high of 1,751,194.000 Person in 1949 and a record low of -394,373.000 Person in 2017. Vital Statistics: Japanese Only: Natural Increase data remains active status in CEIC and is reported by Ministry of Health, Labour and Welfare. The data is categorized under Global Database’s Japan – Table JP.G005: Vital Statistics.

  16. Replication dataset and calculations for PIIE PB 15-3, Japanese Investment...

    • piie.com
    Updated Feb 1, 2015
    Cite
    Lindsay Oldenski; Theodore H. Moran (2015). Replication dataset and calculations for PIIE PB 15-3, Japanese Investment in the United States: Superior Performance, Increasing Integration, by Lindsay Oldenski and Theodore H. Moran. (2015). [Dataset]. https://www.piie.com/publications/policy-briefs/japanese-investment-united-states-superior-performance-increasing
    Explore at:
    Dataset updated
    Feb 1, 2015
    Dataset provided by
    Peterson Institute for International Economics (http://www.piie.com/)
    Authors
    Lindsay Oldenski; Theodore H. Moran
    Area covered
    United States
    Description

    This data package includes the underlying data and files to replicate the calculations, charts, and tables presented in Japanese Investment in the United States: Superior Performance, Increasing Integration, PIIE Policy Brief 15-3. If you use the data, please cite as: Oldenski, Lindsay, and Theodore H. Moran. (2015). Japanese Investment in the United States: Superior Performance, Increasing Integration. PIIE Policy Brief 15-3. Peterson Institute for International Economics.

  17. whisper_transcriptions.reazonspeech.all.wer_10.0.vectorized

    • huggingface.co
    Updated Oct 23, 2024
    + more versions
    Cite
    Japanese ASR (2024). whisper_transcriptions.reazonspeech.all.wer_10.0.vectorized [Dataset]. https://huggingface.co/datasets/japanese-asr/whisper_transcriptions.reazonspeech.all.wer_10.0.vectorized
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 23, 2024
    Dataset authored and provided by
    Japanese ASR
    Description

    japanese-asr/whisper_transcriptions.reazonspeech.all.wer_10.0.vectorized dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. Japan Industrial Machinery: Overseas Sub: No of Employees

    • ceicdata.com
    Updated May 16, 2018
    Cite
    CEICdata.com (2018). Japan Industrial Machinery: Overseas Sub: No of Employees [Dataset]. https://www.ceicdata.com/en/japan/japanese-business-activities-survey-overseas-sub-major-indicators
    Explore at:
    Dataset updated
    May 16, 2018
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Mar 1, 2015 - Dec 1, 2017
    Area covered
    Japan
    Variables measured
    Business Activity Survey
    Description

    Industrial Machinery: Overseas Sub: No of Employees was reported at 471,725.000 persons in Mar 2018, a decrease from the previous figure of 472,111.000 persons in Dec 2017. The series is updated quarterly, averaging 193,903.500 persons (median) from Dec 1996 to Mar 2018, with 86 observations. It reached an all-time high of 472,111.000 persons in Dec 2017 and a record low of 78,489.000 persons in Dec 1996. The series remains active in CEIC and is reported by the Ministry of Economy, Trade and Industry. It is categorized under Global Database’s Japan – Table JP.S059: Japanese Business Activities Survey: Overseas Sub: Major Indicators.

  19. Japan DI Index: Other Mfg: OS: No of Employees

    • ceicdata.com
    Updated Apr 15, 2023
    Cite
    CEICdata.com (2023). Japan DI Index: Other Mfg: OS: No of Employees [Dataset]. https://www.ceicdata.com/en/japan/japanese-business-activities-survey-overseas-sub-diffusion-index/di-index-other-mfg-os-no-of-employees
    Dataset updated
    Apr 15, 2023
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2015 - Dec 1, 2017
    Area covered
    Japan
    Variables measured
    Business Activity Survey
    Description

    Japan DI Index: Other Mfg: OS: Number of Employees was reported at 6.700 % in Mar 2018, an increase from the previous figure of 2.000 % in Dec 2017. The series is updated quarterly, averaging 5.550 % (median) from Dec 1996 to Mar 2018, with 86 observations. It reached an all-time high of 17.400 % in Mar 2010 and a record low of -29.800 % in Dec 2008. The series remains active in CEIC and is reported by the Ministry of Economy, Trade and Industry. It is categorized under Global Database’s Japan – Table JP.S060: Japanese Business Activities Survey: Overseas Sub: Diffusion Index.

  20. Japan Unemployment Rate

    • tradingeconomics.com
    • pl.tradingeconomics.com
    • +12 more
    csv, excel, json, xml
    Updated May 1, 2025
    Cite
    TRADING ECONOMICS (2025). Japan Unemployment Rate [Dataset]. https://tradingeconomics.com/japan/unemployment-rate
    Available download formats: csv, xml, excel, json
    Dataset updated
    May 1, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1953 - Apr 30, 2025
    Area covered
    Japan
    Description

    The unemployment rate in Japan remained unchanged at 2.50 percent in April. This dataset provides the latest reported value for the Japan Unemployment Rate, plus previous releases, historical highs and lows, short-term forecasts and long-term predictions, an economic calendar, survey consensus, and news.
