3 datasets found
  1. Bangor University's Fine Tuned WhisperCpp Model for Verbatim Welsh Language Spontaneous Speech Recognition

    • live.european-language-grid.eu
    Updated Oct 31, 2024
    Cite
    Language Technologies Unit (2024). Bangor University's Fine Tuned WhisperCpp Model for Verbatim Welsh Language Spontaneous Speech Recognition [Dataset]. https://live.european-language-grid.eu/catalogue/ld/23819
    Dataset updated
    Oct 31, 2024
    Dataset authored and provided by
    Language Technologies Unit
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    Bangor
    Description

    This model is a version of the openai/whisper-base model, fine-tuned on transcriptions of Welsh-language spontaneous speech from the Banc Trawsgrifiadau Bangor (btb) dataset, together with read speech from Welsh Common Voice version 18 (cv) as additional training data, and then converted for use in whisper.cpp.

    whisper.cpp is a C/C++ port of Whisper that provides high-performance inference on hardware such as desktops, laptops, and mobile devices, making fully offline transcription possible.
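
    As a rough illustration (ours, not part of the dataset page), a converted ggml model like this one is typically driven through the whisper.cpp C API, as in the minimal sketch below. The model filename is an assumption, audio decoding is elided, and the input is assumed to already be 16 kHz mono float PCM as whisper.cpp requires:

      #include <cstdio>
      #include <vector>
      #include "whisper.h"

      int main() {
          // Load the fine-tuned ggml model (the filename is an assumption;
          // substitute the converted file distributed with this dataset).
          struct whisper_context * ctx = whisper_init_from_file_with_params(
              "ggml-model.bin", whisper_context_default_params());
          if (ctx == nullptr) return 1;

          // whisper.cpp consumes 16 kHz mono f32 samples; decoding audio is
          // out of scope here, so the buffer is left empty in this sketch.
          std::vector<float> pcm;

          // Greedy decoding, with the language forced to Welsh ("cy").
          whisper_full_params wparams =
              whisper_full_default_params(WHISPER_SAMPLING_GREEDY);
          wparams.language = "cy";

          if (whisper_full(ctx, wparams, pcm.data(), (int) pcm.size()) == 0) {
              for (int i = 0; i < whisper_full_n_segments(ctx); ++i)
                  printf("%s\n", whisper_full_get_segment_text(ctx, i));
          }

          whisper_free(ctx);
          return 0;
      }

    Running entirely on the CPU in this way is what makes the smaller model attractive for offline use, at the accuracy cost noted below.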

    The model is smaller than the corresponding model intended for hosting on cloud GPU-based infrastructure, techiaith/whisper-large-v3-ft-btb-cv-cy, and is therefore not as accurate.

    It achieves the following word error rate (WER) and character error rate (CER) when transcribing Welsh-language spontaneous speech (both metrics are sketched just below the list):

    • WER: 62.76
    • CER: 27.70
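
    For reference (standard definitions, not from the dataset page; the figures above are presumably percentages), WER counts word-level substitutions S, deletions D, and insertions I against a reference transcript of N words:

      \mathrm{WER} = \frac{S + D + I}{N} \times 100

    CER is the same edit-distance calculation over characters rather than words, which is why it can be substantially lower than WER for the same output: a misrecognised word often still matches most of its reference characters.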

  2. AI-Based Transcription and Captioning Services Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 2, 2025
    Cite
    Market Report Analytics (2025). AI-Based Transcription and Captioning Services Report [Dataset]. https://www.marketreportanalytics.com/reports/ai-based-transcription-and-captioning-services-52532
    Available download formats: doc, pdf, ppt
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The AI-based transcription and captioning services market is experiencing robust growth, driven by increasing demand for accessible content and rising adoption of AI-powered solutions across diverse sectors. The market, estimated at $2 billion in 2025, is projected to grow at a compound annual growth rate (CAGR) of 20% from 2025 to 2033, reaching an estimated $8 billion by 2033. This expansion is fueled by several factors. The media and entertainment industry is a major driver, leveraging AI for faster, more accurate transcription of audio and video content to improve accessibility and workflow efficiency. Online education and training platforms are also significant adopters, using AI captioning to cater to diverse learning styles and enhance accessibility for students. The rise of remote work and virtual meetings further boosts demand for real-time captioning and transcription. Technological advancements, including improvements in natural language processing (NLP) and speech recognition accuracy, are key enablers of this growth. The market is segmented by application (Media & Entertainment, Online Education & Training, Meetings & Conferences, Others) and type (Cloud-Based, On-Premises), with the cloud-based segment expected to dominate owing to its scalability, cost-effectiveness, and accessibility.

    Despite the positive outlook, market growth faces certain challenges. Data privacy concerns and the need for robust security measures to protect sensitive information remain significant hurdles. The accuracy of AI-based transcription, especially in noisy environments or with diverse accents, still requires improvement. The high initial investment cost of AI-powered solutions can also be a barrier, particularly for smaller organizations, although ongoing advances in AI technology and the increasing affordability of solutions are expected to ease these restraints over time. The competitive landscape is dynamic, with established players such as IBM and newer entrants built around OpenAI's Whisper vying for market share. Success hinges on delivering high-accuracy transcription, efficient user interfaces, and robust security while adapting to the evolving needs of diverse industries.
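
    As a quick arithmetic check of the projection (ours, not the report's), compounding the stated 2025 base at the stated CAGR over the eight years to 2033 gives

      \$2\,\mathrm{B} \times (1 + 0.20)^{8} \approx \$2\,\mathrm{B} \times 4.30 \approx \$8.6\,\mathrm{B},

    which is broadly consistent with the roughly $8 billion figure once rounding of the CAGR and endpoints is allowed for.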

  3. Live-WhisperX-526K

    • huggingface.co
    Updated Apr 23, 2025
    Cite
    Joya Chen (2025). Live-WhisperX-526K [Dataset]. https://huggingface.co/datasets/chenjoya/Live-WhisperX-526K
    Dataset updated
    Apr 23, 2025
    Authors
    Joya Chen
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for Live-WhisperX-526K

      Uses
    

    This dataset is used for training the LiveCC-7B-Instruct model. We only allow the use of this dataset for academic research and educational purposes. For the user prompts generated with OpenAI GPT-4o, we recommend that users check the OpenAI Usage Policy.

    Project Page: https://showlab.github.io/livecc
    Paper: https://huggingface.co/papers/2504.16030

      Data Sources
    

    After we finished the pre-training of LiveCC-7B-Base model… See the full description on the dataset page: https://huggingface.co/datasets/chenjoya/Live-WhisperX-526K.
