Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

📅 2026-05-17

🤖 AI Summary
This study examines how automatic speech recognition (ASR) errors propagate through ASR-LLM cascades in Korean spoken question answering (SQA) and lead to downstream semantic failures that conventional ASR metrics do not fully capture. The relative degradation caused by ASR errors is consistent across large language models (LLMs) with different absolute performance, which suggests that cascade degradation largely tracks information loss at the ASR stage. Single-character Korean ASR errors form a distinct failure channel: the gold answer can be entirely absent from the downstream prediction despite only a minimal transcription difference. In an auxiliary comparison on noisy Korean SQA, a large audio language model outperforms an ASR-LLM pipeline with a matched language backbone, indicating that direct audio input has the potential to mitigate transcript-induced information loss.
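One way to read "relative degradation" is as the fraction of clean-transcript performance that is lost once the LLM reads ASR output instead. The sketch below illustrates that reading only; the function name `relative_degradation`, the model labels, and every score are hypothetical and are not taken from the paper, whose exact metric is not given on this page.

```python
def relative_degradation(clean_score: float, asr_score: float) -> float:
    """Fraction of clean-transcript performance lost when the LLM reads ASR output.

    Assumed definition for illustration; the paper may define it differently.
    """
    if clean_score <= 0:
        raise ValueError("clean_score must be positive")
    return (clean_score - asr_score) / clean_score


# Made-up QA scores for two hypothetical LLMs. These are NOT results from the paper.
illustrative_scores = {
    "llm_strong": {"clean": 0.80, "asr": 0.60},
    "llm_weak": {"clean": 0.40, "asr": 0.30},
}

drops = {
    name: relative_degradation(scores["clean"], scores["asr"])
    for name, scores in illustrative_scores.items()
}

for name, drop in drops.items():
    print(f"{name}: relative degradation = {drop:.2%}")
```

With these made-up numbers the two models differ by a factor of two in absolute score yet lose the same share of it, which is the kind of pattern the summary describes as degradation tracking ASR-stage information loss rather than LLM capability.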
📝 Abstract
We analyze how automatic speech recognition (ASR) errors propagate through ASR-LLM cascades in Korean spoken question answering (SQA), focusing on downstream semantic failures that conventional ASR metrics cannot fully capture. Our analysis shows that the relative downstream degradation caused by ASR errors is consistent across LLMs with different absolute performance, suggesting that cascade degradation largely tracks ASR-stage information loss. We further identify single-character Korean ASR errors as a distinct semantic-failure channel, where the gold answer becomes entirely absent from the downstream prediction despite only a minimal transcription difference. Finally, an auxiliary comparison shows that a large audio language model outperforms an ASR-LLM pipeline with a matched language backbone in noisy Korean SQA, indicating the potential of direct audio input to mitigate transcript-induced information loss.
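The gap between a conventional ASR metric and a semantic failure can be made concrete with a minimal sketch. It computes a character error rate (CER) over Hangul syllables and then checks whether the gold answer is absent from the downstream prediction. Everything here is an assumption for illustration: the helper names (`edit_distance`, `character_error_rate`, `gold_answer_absent`), the substring-based absence check, and the example sentences are invented and do not come from the paper's data or evaluation code.

```python
import unicodedata


def normalize(text: str) -> str:
    # Compose Hangul into precomposed syllables so one syllable is one character.
    return unicodedata.normalize("NFC", text).strip()


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance using two rolling rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    ref, hyp = normalize(ref), normalize(hyp)
    return edit_distance(ref, hyp) / len(ref)


def gold_answer_absent(gold: str, prediction: str) -> bool:
    # Assumed criterion: the gold string does not occur anywhere in the prediction.
    return normalize(gold) not in normalize(prediction)


# Invented example: one syllable changes "apple" (사과) into "lion" (사자).
reference = "사과의 색은 무엇인가요?"
hypothesis = "사자의 색은 무엇인가요?"
gold = "빨간색"
prediction = "사자는 보통 황갈색입니다."  # hypothetical LLM output

distance = edit_distance(normalize(reference), normalize(hypothesis))
cer = character_error_rate(reference, hypothesis)
failed = gold_answer_absent(gold, prediction)
print(f"edit distance={distance}, CER={cer:.3f}, gold answer absent={failed}")
```

In this toy case the transcript differs by a single syllable and the CER stays small, yet the question now asks about a different entity, so a prediction that answers it faithfully no longer contains the gold answer.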
Problem

Research questions and friction points this paper is trying to address.

error propagation
Korean spoken QA
ASR-LLM cascades
semantic failure
automatic speech recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

error propagation
ASR-LLM cascade
Korean spoken QA
semantic failure
audio language model