Evaluating and Improving Automatic Speech Recognition Systems for Korean Meteorological Experts

📅 2024-10-24
🏛️ arXiv.org
🤖 AI Summary
This work addresses degraded automatic speech recognition (ASR) performance in Korean meteorological applications, caused by domain-specific terminology and linguistic complexity. To support Korean meteorological experts, we develop a specialized spoken-query ASR system. We introduce the first benchmark dataset for spoken queries in the Korean meteorological domain; propose a lightweight text-to-speech (TTS)-driven data augmentation method that improves recognition accuracy of meteorological terms without compromising general ASR capability; and conduct domain adaptation and evaluation based on multilingual Whisper ASR models. Experiments demonstrate a +12.3% absolute improvement in meteorological term recognition accuracy. Furthermore, we publicly release the dataset, evaluation framework, and augmentation methodology, establishing a reusable benchmark and a practical technical pathway for vertical-domain ASR development in Korean.

📝 Abstract
This paper explores integrating Automatic Speech Recognition (ASR) into natural language query systems to improve weather forecasting efficiency for Korean meteorologists. We address challenges in developing ASR systems for the Korean weather domain, specifically specialized vocabulary and Korean linguistic intricacies. To tackle these issues, we constructed an evaluation dataset of spoken queries recorded by native Korean speakers. Using this dataset, we assessed various configurations of a multilingual ASR model family, identifying performance limitations related to domain-specific terminology. We then implemented a simple text-to-speech-based data augmentation method, which improved the recognition of specialized terms while maintaining general-domain performance. Our contributions include creating a domain-specific dataset, comprehensive ASR model evaluations, and an effective augmentation technique. We believe our work provides a foundation for future advancements in ASR for the Korean weather forecasting domain.
Problem

Research questions and friction points this paper is trying to address.

Improving weather forecasting efficiency using ASR for Korean meteorologists.
Addressing ASR challenges posed by specialized vocabulary and Korean linguistic intricacies in the weather domain.
Developing and evaluating ASR models with domain-specific Korean datasets.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated ASR into a natural language query system for Korean weather forecasting.
Developed a domain-specific Korean evaluation dataset of spoken queries.
Implemented a text-to-speech-based data augmentation technique.
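A minimal sketch of what text-to-speech-based augmentation could look like, assuming training sentences are built by filling query templates with domain terms and then synthesized. The listing does not describe the authors' pipeline in this detail, so the templates, terms, file naming, and the `synthesize` callable (a stand-in for whatever TTS engine is used) are all hypothetical.

```python
from itertools import product
from pathlib import Path
from typing import Callable


def build_texts(templates: list[str], terms: list[str]) -> list[str]:
    """Fill every query template with every domain term."""
    return [tpl.format(term=term) for tpl, term in product(templates, terms)]


def synthesize_corpus(
    texts: list[str],
    synthesize: Callable[[str, Path], None],
    out_dir: Path,
) -> list[dict[str, str]]:
    """Synthesize each text and return a manifest of (audio, text) training pairs.

    `synthesize` is supplied by the caller and is expected to write speech
    for the given text to the given path; no specific TTS library is assumed.
    """
    manifest = []
    for idx, text in enumerate(texts):
        wav_path = out_dir / f"aug_{idx:05d}.wav"
        synthesize(text, wav_path)
        manifest.append({"audio": str(wav_path), "text": text})
    return manifest


# Placeholder TTS for illustration: records requests instead of producing audio.
requested: list[tuple[str, Path]] = []


def fake_synthesize(text: str, path: Path) -> None:
    requested.append((text, path))


templates = ["오늘 {term} 알려줘", "{term} 예보 보여줘"]  # hypothetical query templates
terms = ["강수량", "태풍"]                              # hypothetical domain terms

texts = build_texts(templates, terms)
manifest = synthesize_corpus(texts, fake_synthesize, Path("aug_audio"))
```

The resulting manifest pairs each synthetic audio file with its transcript, which is the form most ASR fine-tuning setups expect. How the synthetic pairs are combined with general-domain data during fine-tuning is a separate design choice not covered by this sketch.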