100+ datasets found
  1. japanese-speech-recognition-dataset

    • huggingface.co
    Updated Aug 2, 2025
    Unidata NLP (2025). japanese-speech-recognition-dataset [Dataset]. https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset
    Explore at:
    Dataset updated
    Aug 2, 2025
    Authors
    Unidata NLP
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Japanese Telephone Dialogues Dataset - 513 Hours

    The dataset comprises 513 hours of high-quality telephone audio recordings in Japanese, featuring 800+ native speakers and achieving a 95% sentence accuracy rate. Designed for advancing speech recognition models and language processing, this extensive speech corpus covers diverse topics and domains, making it well suited for training robust automatic speech recognition (ASR) systems.

      Dataset characteristics:… See the full description on the dataset page: https://huggingface.co/datasets/ud-nlp/japanese-speech-recognition-dataset.
    
  2. English-Japanese Parallel Corpus for the Environment Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    FutureBee AI (2022). English-Japanese Parallel Corpus for the Environment Domain [Dataset]. https://www.futurebeeai.com/dataset/parallel-corpora/japanese-english-translated-parallel-corpus-for-environment-domain
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the English-Japanese Bilingual Parallel Corpora Dataset for the Environment domain, a comprehensive collection of professionally translated bilingual text data. This dataset has been carefully curated to support the development of environment-specific language models, machine translation engines, and domain-aware NLP applications.

    Dataset Content

    Volume and Diversity
    Extensive Dataset: Over 50,000 sentence pairs, offering robust coverage for multiple NLP use cases.
    Translator Diversity: Contributions from 200+ native translators, ensuring a wide range of linguistic styles and cultural interpretations.
    Sentence Diversity
    Word Count: Sentences range from 7 to 25 words, optimized for NLP model training.
    Syntactic Variety: Includes simple, compound, and complex sentences.
    Interrogative & Imperative Forms: Reflects real-life usage with both questions and commands.
    Affirmative & Negative Polarity: Covers positive and negative sentence constructions.
    Voice Variation: Features both active and passive voice forms.
    Idiomatic & Figurative Language: Contains metaphors and idioms relevant to environmental discussions.
    Discourse Markers: Includes logical connectors, conjunctions, and transitions to capture natural flow.
    Cross Translation: Bidirectional translation (English→Japanese and Japanese→English) for superior training of bilingual systems.

    Domain-Specific Focus

    Rich Environmental Context
    Industry-Tailored Terminology: Includes technical terms from ecology, conservation, climate science, and sustainability.
    Authentic Expressions: Captures idiomatic language used in environmental discourse, including topics like biodiversity, climate change, and policy.
    Real-World Contexts: Content drawn from impact assessments, scientific research, sustainability reports, and more.
    Cross-Domain Relevance: Contains overlapping content from fields like urban planning, geography, public health, and renewable energy.

    Format & Structure

    Available Formats: Excel (default), with options to convert into JSON, TMX, XML, XLIFF, and more.
    Structure Includes:
    Serial Number
    Unique ID
    Source Sentence
    Source Word Count
    Target Sentence
    Target Word Count

    Applications

    NLP & AI Use Cases
    Machine Translation: Train high-accuracy bilingual translation models for environmental content.
    Text Processing: Improve spellcheckers, grammar tools, predictive typing, and conversational agents focused on environmental topics.
    LLM Training: Fine-tune Large Language Models for: Environmental Q&A, Climate report summarization, Green policy dialogue generation.

    Secure & Ethical Collection

    Built using FutureBeeAI’s secure Yugo platform.
    No PII: The dataset contains no personally identifiable information.
    IP Safe: All content is original and free from copyright or licensing conflicts.
    Fully Confidential: Data remained within a secure environment throughout the

  3. 633 Hours - Japanese Speech Dataset (Mobile Phone Recordings)

    • m.nexdata.ai
    • nexdata.ai
    Updated Feb 11, 2024
    Nexdata (2024). 633 Hours - Japanese Speech Dataset (Mobile Phone Recordings) [Dataset]. https://m.nexdata.ai/datasets/speechrecog/1166?source=Github
    Explore at:
    Dataset updated
    Feb 11, 2024
    Dataset authored and provided by
    Nexdata
    Variables measured
    Format, Country, Speaker, Language, Annotation, Accuracy rate, Recording device, Recording Content, Language(Region) Code, Recording Environment
    Description

    This dataset contains 633 hours of spontaneous Japanese dialogues based on given topics. The recordings are transcribed with text content, timestamps, speaker ID, gender, and other attributes. Our dataset was collected from a large, geographically diverse pool of around 1,000 native speakers, which helps model performance on real-world, complex tasks such as Automatic Speech Recognition (ASR), Text-to-Speech (TTS) systems, and NLP research. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. Our datasets all comply with GDPR, CCPA, and PIPL.

  4. Data from: Japanese Elder’s Language Index Corpus v2

    • figshare.com
    zip
    Updated Feb 11, 2016
    Eiji Aramaki (2016). Japanese Elder’s Language Index Corpus v2 [Dataset]. http://doi.org/10.6084/m9.figshare.2082706.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 11, 2016
    Dataset provided by
    figshare
    Figshare http://figshare.com/
    Authors
    Eiji Aramaki
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A corpus database managed by the MedNLP Laboratory, Kyoto University, Japan. This corpus was compiled using data from 22 people aged 74 to 86 years (mean age: 78.32 years; standard deviation [SD]: 3.36) who agreed to provide data for research purposes. The corpus also includes data from people under 74, for a total of 30 records.

  5. 234 Hours-Japanese Speech Dataset (Mobile Phone Recordings)

    • m.nexdata.ai
    • nexdata.ai
    Updated May 4, 2025
    Nexdata (2025). 234 Hours-Japanese Speech Dataset (Mobile Phone Recordings) [Dataset]. https://m.nexdata.ai/datasets/speechrecog/58?source=Huggingface
    Explore at:
    Dataset updated
    May 4, 2025
    Dataset authored and provided by
    Nexdata
    Variables measured
    Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition, Language(Region) Code, Features of annotation
    Description

    This dataset contains 234 hours of Japanese speech audio, collected as monologues read from given scripts and covering 210,000 formal and informal expressions. The recordings are transcribed with text content and other attributes. Our dataset was collected from a large, geographically diverse pool of speakers (799 Japanese speakers recorded in mixed conditions, such as indoors, at the roadside, and in restaurants), which helps model performance on real-world, complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. Our datasets all comply with GDPR, CCPA, and PIPL.

  6. Replication dataset and calculations for PIIE PB 15-3, Japanese Investment...

    • piie.com
    Updated Feb 1, 2015
    Lindsay Oldenski; Theodore H. Moran (2015). Replication dataset and calculations for PIIE PB 15-3, Japanese Investment in the United States: Superior Performance, Increasing Integration, by Lindsay Oldenski and Theodore H. Moran. (2015). [Dataset]. https://www.piie.com/publications/policy-briefs/japanese-investment-united-states-superior-performance-increasing
    Explore at:
    Dataset updated
    Feb 1, 2015
    Dataset provided by
    Peterson Institute for International Economics http://www.piie.com/
    Authors
    Lindsay Oldenski; Theodore H. Moran
    Area covered
    United States
    Description

    This data package includes the underlying data and files to replicate the calculations, charts, and tables presented in Japanese Investment in the United States: Superior Performance, Increasing Integration, PIIE Policy Brief 15-3. If you use the data, please cite as: Oldenski, Lindsay, and Theodore H. Moran. (2015). Japanese Investment in the United States: Superior Performance, Increasing Integration. PIIE Policy Brief 15-3. Peterson Institute for International Economics.

  7. Japanese Kids Speech database (Upper Grade)

    • live.european-language-grid.eu
    • catalog.elra.info
    audio format
    Updated Oct 7, 2020
    + more versions
    (2020). Japanese Kids Speech database (Upper Grade) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/2195
    Explore at:
    Available download formats: audio format
    Dataset updated
    Oct 7, 2020
    License

    http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    Description

    The Japanese Kids Speech database (Upper Grade) contains recordings of 232 Japanese child speakers (104 male and 128 female), aged 9 to 13 (fourth, fifth and sixth graders in elementary school), recorded in quiet rooms using smartphones. This database may be combined with the Japanese Kids Speech database (Lower Grade), also available in the ELRA Catalogue under reference ELRA-S0411.

    The number of speakers, number of utterances, age range, and duration are as follows:

    Number of speakers: 232 (104 male/128 female)

    Number of utterances (average): 385 utterances per speaker

    Total number of utterances: 89,454

    Age: from 9 to 13 years old

    Total hours of data: 145.4

    A total of 1,018 sentences were used. Recordings were made with smartphones, and the audio data are stored in .wav files as 16 kHz, mono, 16-bit linear PCM.

    Database:

    ・Audio data: WAV format, 16 kHz, 16-bit, mono (recorded with smartphone)

    ・Recording scripts: TSV format(tab-delimited), UTF-8 (without BOM)

    ・Transcription data: TSV format(tab-delimited), UTF-8 (without BOM)

    ・Size: 16.2GB
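    The documented audio format (16 kHz, 16-bit, mono, linear PCM) can be checked per file before training. The following is a minimal sketch using Python's standard `wave` module; the function name `matches_documented_format` is hypothetical and is not part of the database distribution.

```python
import wave


def matches_documented_format(path: str) -> bool:
    """Return True if the WAV file is 16 kHz, mono, 16-bit, uncompressed PCM.

    The expected values are taken from the database description above;
    the function itself is an illustrative helper, not a vendor tool.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 16000      # 16 kHz sampling rate
            and wav.getnchannels() == 1      # mono
            and wav.getsampwidth() == 2      # 2 bytes per sample = 16 bits
            and wav.getcomptype() == "NONE"  # uncompressed linear PCM
        )
```

    Running this over every file under `voices/` is one way to catch files that were resampled or converted after download.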

    Number of speakers per age:

    9 years old: 56 (21 male, 35 female)

    10 years old: 71 (30 male, 41 female)

    11 years old: 65 (28 male, 37 female)

    12 years old: 38 (24 male, 14 female)

    13 years old: 2 (1 male, 1 female)

    Structure of database:

    ├─ readme.txt
    ├─ Japanese Kids Speech Database.pdf    Description document of the database
    ├─ Transcription.tsv                    Transcription
    ├─ scripts.tsv                          Script
    └─ voices/                              directory of audio data
        ├─ high/                            directory of upper grade
            └─ (speaker_ID/)                directory of speaker ID (six digits)
                └─ (audio_file)             audio file (WAV format, 16 kHz, 16-bit, mono)

    File naming conventions of audio files are as follows:

    Field number | Contents | Description | Remarks

    0 | Language ID | “JA” (fixed) | Japanese

    1 | Speaker ID | Six digit | 5XXXXX

    2 | Script ID | HXXXX | XXXX: four digits

    3 | Age | Two digits

    4 | Gender | M: male, F: female

    The field separator character is “_”.

    For example, if the audio file name is “JA_500002_H0001_10_F.wav”, the name has the following meaning:

    JA: Language ID (Japanese)

    500002: speaker ID

    H0001: script ID

    10: age (ten years old)

    F: gender (female)
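
    The naming convention above can be sketched as a small parser. This is a minimal illustration, not part of the database distribution: `UtteranceName` and `parse_utterance_name` are hypothetical names, and the strictness of the pattern (fixed `JA` prefix, ASCII digits only, `H` plus four digits, `.wav` extension) is an assumption drawn from the field table.

```python
import re
from dataclasses import dataclass

# Pattern assumed from the field table: JA_<6 digits>_H<4 digits>_<2 digits>_<M|F>.wav
_NAME_PATTERN = re.compile(
    r"(JA)_(\d{6})_(H\d{4})_(\d{2})_([MF])\.wav",
    re.ASCII,
)


@dataclass(frozen=True)
class UtteranceName:
    language_id: str  # field 0, always "JA"
    speaker_id: str   # field 1, six digits
    script_id: str    # field 2, "H" followed by four digits
    age: int          # field 3, two digits in the file name
    gender: str       # field 4, "M" or "F"


def parse_utterance_name(filename: str) -> UtteranceName:
    """Split an audio file name into its five underscore-separated fields."""
    match = _NAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected audio file name: {filename!r}")
    language_id, speaker_id, script_id, age, gender = match.groups()
    return UtteranceName(language_id, speaker_id, script_id, int(age), gender)


if __name__ == "__main__":
    print(parse_utterance_name("JA_500002_H0001_10_F.wav"))
```

    Keeping the speaker and script IDs as strings preserves leading zeros, which matters if they are later joined against `Transcription.tsv` or `scripts.tsv`.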

  8. Japanese language instructors Business Data for United States

    • poidata.io
    csv, json
    Updated Sep 23, 2025
    + more versions
    Business Data Provider (2025). Japanese language instructors Business Data for United States [Dataset]. https://www.poidata.io/report/japanese-language-instructor/united-states
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Sep 23, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    United States
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Business Categories, Geographic Coordinates
    Description

    Comprehensive dataset containing 60 verified Japanese language instructor businesses in the United States, with complete contact information, ratings, reviews, and location data.

  9. Corpus of Spontaneous Japanese

    • service.tib.eu
    Updated Jan 3, 2025
    (2025). Corpus of Spontaneous Japanese [Dataset]. https://service.tib.eu/ldmservice/dataset/corpus-of-spontaneous-japanese
    Explore at:
    Dataset updated
    Jan 3, 2025
    Description

    The Corpus of Spontaneous Japanese: Its design and evaluation [30] is a dataset of spontaneous Japanese speech.

  10. National Survey of the Japanese Elderly

    • neuinfo.org
    • dknet.org
    • +2 more
    Updated Sep 22, 2023
    (2023). National Survey of the Japanese Elderly [Dataset]. http://identifiers.org/RRID:SCR_008971
    Explore at:
    Dataset updated
    Sep 22, 2023
    Description

    A panel data set for use in cross-cultural analyses of aging, health, and well-being between the U.S. and Japan. The questionnaires were designed to be partially comparable to many surveys of the aged, including Americans' Changing Lives; 1984 National Health Interview Survey Supplement on Aging; Health and Retirement Study (HRS); and Well-Being Among the Aged: Personal Control and Self-Esteem (WBA).

    NSJE questionnaire topics include:
    * Demographics (age, sex, marital status, education, employment)
    * Social Integration (interpersonal contacts, social supports)
    * Health Limitations on daily life and activities
    * Health Conditions
    * Health Status (ratings of present health)
    * Level of physical activity
    * Subjective Well-Being and Mental Health Status (life satisfaction, morale)
    * Psychological Indicators (life events, locus of control, self-esteem)
    * Financial situation (financial status)
    * Memory (measures of cognitive functioning)
    * Interviewer observations (assessments of respondents)

    The NSJE was based on a national sample of 2,200 noninstitutionalized elderly aged 60+ in Japan. This cohort has been interviewed once every 3 years since 1987. To ensure that the data are representative of the 60+ population, the samples in 1990 and 1996 were refreshed to add individuals aged 60-62. In 1999, a new cohort of Japanese adults aged 70+ was added to the surviving members of previous cohorts to form a database of 3,990 respondents 63+, of which some 3,000 were 70+. Currently a 6-wave longitudinal database (1987, 1990, 1993, 1996, 1999, & 2002) is in place; wave 7 began in 2006.

    Data Availability: Data from the first three waves of the National Survey of the Japanese Elderly are currently in the public domain and can be obtained from ICPSR. Additional data are being prepared for future public release.

    * Dates of Study: 1987-2006
    * Study Features: Longitudinal, International
    * Sample Size:
    ** 1987: 2,200
    ** 1990: 2,780
    ** 1993: 2,780
    ** 1996:
    ** 1999: 3,990
    ** 2002:
    ** 2006:

    Links:
    * 1987 (ICPSR): http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06842
    * 1990 (ICPSR): http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03407
    * 1993 (ICPSR): http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/04145
    * 1996 (ICPSR): http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/26621

  11. Japan Vital Statistics: Japanese Only: Natural Increase

    • ceicdata.com
    Updated Nov 15, 2017
    CEICdata.com (2018). Japan Vital Statistics: Japanese Only: Natural Increase [Dataset]. https://www.ceicdata.com/en/japan/vital-statistics/vital-statistics-japanese-only-natural-increase
    Explore at:
    Dataset updated
    Nov 15, 2017
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2006 - Dec 1, 2017
    Area covered
    Japan
    Variables measured
    Vital Statistics
    Description

    Vital Statistics: Japanese Only: Natural Increase data was reported at -394,373.000 Person in 2017. This records a decrease from the previous number of -330,770.000 Person for 2016. Vital Statistics: Japanese Only: Natural Increase data is updated yearly, averaging 768,649.000 Person from Dec 1947 (Median) to 2017, with 71 observations. The data reached an all-time high of 1,751,194.000 Person in 1949 and a record low of -394,373.000 Person in 2017. Vital Statistics: Japanese Only: Natural Increase data remains in active status in CEIC and is reported by the Ministry of Health, Labour and Welfare. The data is categorized under Global Database’s Japan – Table JP.G005: Vital Statistics.

  12. Japanese Cedar Dataset

    • universe.roboflow.com
    zip
    Updated Dec 20, 2024
    bamboo (2024). Japanese Cedar Dataset [Dataset]. https://universe.roboflow.com/bamboo-meovx/japanese-cedar
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 20, 2024
    Dataset authored and provided by
    bamboo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    JC
    Description

    Japanese Cedar

    ## Overview
    
    Japanese Cedar is a dataset for classification tasks - it contains JC annotations for 200 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  13. Japanese Oceanographic Data Center Japan Land Gravity

    • catalog.data.gov
    • ncei.noaa.gov
    Updated Oct 18, 2024
    NOAA National Centers for Environmental Information (Point of Contact) (2024). Japanese Oceanographic Data Center Japan Land Gravity [Dataset]. https://catalog.data.gov/dataset/japanese-oceanographic-data-center-japan-land-gravity2
    Explore at:
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration http://www.noaa.gov/
    National Centers for Environmental Information https://www.ncei.noaa.gov/
    Area covered
    Japan
    Description

    The gravity station data (4,381 records) were compiled by the Japanese Oceanographic Data Center. This database was received in July 1988. The data are in the 'MGD77' exchange format. Principal gravity parameters include free-air anomalies and observed gravity corrected for Eötvös effect, drift, and tares. The observed gravity values are referenced to the International Gravity Standardization Net 1971 (IGSN 71). The gravity anomaly computation uses the Geodetic Reference System 1967 (GRS 67) theoretical gravity formula. The data are randomly distributed within the boundaries of Japan.

  14. japanese-tts

    • huggingface.co
    Updated Jul 19, 2025
    Abdou Mohamed Naira (2025). japanese-tts [Dataset]. https://huggingface.co/datasets/nairaxo/japanese-tts
    Explore at:
    Dataset updated
    Jul 19, 2025
    Authors
    Abdou Mohamed Naira
    Description

    nairaxo/japanese-tts dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Japan Vital Statistics (VS): Japanese Only: Marriage: Total

    • ceicdata.com
    Updated Apr 15, 2023
    + more versions
    CEICdata.com (2023). Japan Vital Statistics (VS): Japanese Only: Marriage: Total [Dataset]. https://www.ceicdata.com/en/japan/vital-statistics-marriage/vital-statistics-vs-japanese-only-marriage-total
    Explore at:
    Dataset updated
    Apr 15, 2023
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2006 - Dec 1, 2017
    Area covered
    Japan
    Description

    Vital Statistics (VS): Japanese Only: Marriage: Total data was reported at 606,863.000 Person in 2017. This records a decrease from the previous number of 620,531.000 Person for 2016. Vital Statistics (VS): Japanese Only: Marriage: Total data is updated yearly, averaging 774,702.000 Person from Dec 1947 (Median) to 2017, with 71 observations. The data reached an all-time high of 1,099,984.000 Person in 1972 and a record low of 606,863.000 Person in 2017. Vital Statistics (VS): Japanese Only: Marriage: Total data remains in active status in CEIC and is reported by the Ministry of Health, Labour and Welfare. The data is categorized under Global Database’s Japan – Table JP.G010: Vital Statistics: Marriage.

  16. Made-In Country Index: perception of products made in Japan, by country 2017...

    • statista.com
    • tokrwards.com
    Updated Jun 24, 2024
    Umair Bashir (2024). Made-In Country Index: perception of products made in Japan, by country 2017 [Dataset]. https://www.statista.com/topics/2505/japan/
    Explore at:
    Dataset updated
    Jun 24, 2024
    Dataset provided by
    Statista http://statista.com/
    Authors
    Umair Bashir
    Area covered
    Japan
    Description

    This ranking displays the results of the worldwide Made-In-Country Index 2017, a survey conducted to show how positively products "made in..." are perceived in various countries all over the world. During this survey, all respondents (100 percent) from Vietnam perceived products made in Japan as "slightly positive" or "very positive".

  17. Japanese Speech Recognition Corpus (desktop) – Japanese place name (200...

    • live.european-language-grid.eu
    • catalogue.elra.info
    audio format
    + more versions
    Japanese Speech Recognition Corpus (desktop) – Japanese place name (200 people) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/2017
    Explore at:
    Available download formats: audio format
    License

    http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    Description

    This corpus comprises 2,000 Japanese place names uttered by 200 speakers of different dialects, ages and educational levels, recorded over 4 channels. Speech samples are stored as 16-bit, 48 kHz WAV files, with 3.93 hours of speech per channel. The total size of the data is 3.96 Gb. Each speaker read 10 items. Text files are stored in Unicode format. All data have been proofread manually. The corpus is intended for use in testing and in telephone natural speech recognition systems. This corpus is partly included in ELRA-S0228-54.

  18. Leading subscription service formats in Japan 2025

    • statista.com
    Updated Jun 20, 2025
    Statista (2025). Leading subscription service formats in Japan 2025 [Dataset]. https://www.statista.com/statistics/1227394/japan-most-common-subscription-models/
    Explore at:
    Dataset updated
    Jun 20, 2025
    Dataset authored and provided by
    Statista http://statista.com/
    Time period covered
    Feb 1, 2025 - Feb 7, 2025
    Area covered
    Japan
    Description

    The majority of consumers in Japan, more than ** percent, have not used a recurring service recently, according to a survey conducted in January 2025. The most commonly used subscription format was the flat-rate model, which allows unlimited use of online services and digital content within the service terms.

  19. Digital publishing market size Japan 2015-2024

    • statista.com
    Updated Feb 12, 2024
    Statista Research Department (2024). Digital publishing market size Japan 2015-2024 [Dataset]. https://www.statista.com/topics/9291/publishing-industry-in-japan/
    Explore at:
    Dataset updated
    Feb 12, 2024
    Dataset provided by
    Statista http://statista.com/
    Authors
    Statista Research Department
    Area covered
    Japan
    Description

    The market size of digital publications in Japan was estimated at 566 billion Japanese yen in 2024, which was an increase of 30.9 billion yen compared to the previous year. Digital publishing and print publishing together constitute the larger publishing market.

  20. United States Imports from Japan

    • tradingeconomics.com
    csv, excel, json, xml
    Updated May 29, 2017
    TRADING ECONOMICS (2017). United States Imports from Japan [Dataset]. https://tradingeconomics.com/united-states/imports/japan
    Explore at:
    Available download formats: excel, json, xml, csv
    Dataset updated
    May 29, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1990 - Dec 31, 2025
    Area covered
    United States
    Description

    United States imports from Japan were US$152.07 billion in 2024, according to the United Nations COMTRADE database on international trade. United States Imports from Japan - data, historical chart and statistics - was last updated in September 2025.
