🤖 AI Summary
This work addresses the significant performance degradation of automatic speech recognition (ASR) in noisy environments, where conventional speech enhancement frontends often introduce artifacts that impair recognition accuracy. To mitigate this issue without retraining either the enhancement or ASR models, the authors propose an untrained intelligibility-guided fusion strategy that dynamically combines noisy and enhanced speech signals. The method leverages intelligibility estimates derived directly from the ASR backend to generate frame-level or utterance-level fusion weights, enabling a lightweight and model-agnostic observation fusion mechanism. Experimental results demonstrate that the proposed approach consistently outperforms existing observation fusion baselines across diverse enhancement-ASR system combinations and datasets, exhibiting strong robustness and generalization capability.
📄 Abstract
Automatic speech recognition (ASR) degrades severely in noisy environments. Although speech enhancement (SE) front-ends effectively suppress background noise, they often introduce artifacts that harm recognition. Observation addition (OA) addresses this issue by fusing noisy and SE-enhanced speech, improving recognition without modifying the parameters of the SE or ASR models. This paper proposes an intelligibility-guided OA method, in which fusion weights are derived from intelligibility estimates obtained directly from the backend ASR. Unlike prior OA methods based on trained neural predictors, the proposed method is training-free, which reduces complexity and enhances generalization. Extensive experiments across diverse SE-ASR combinations and datasets demonstrate strong robustness and consistent improvements over existing OA baselines. Additional analyses of intelligibility-guided switching-based alternatives and of frame-level versus utterance-level OA further validate the proposed design.
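The core OA mechanism described above, fusing noisy and enhanced speech under an intelligibility-derived weight, can be sketched as follows. This is a minimal illustration only, assuming a convex per-frame combination; the function names and the exact weighting scheme are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of observation addition (OA): combine noisy and
# SE-enhanced speech frames using weights in [0, 1]. A weight near 1
# trusts the enhanced signal; a weight near 0 falls back to the noisy
# observation. In the paper's setting the weights would come from
# intelligibility estimates produced by the backend ASR.

def fuse_frames(noisy, enhanced, weights):
    """Frame-level OA: per-frame convex fusion w*enhanced + (1-w)*noisy."""
    assert len(noisy) == len(enhanced) == len(weights)
    return [w * e + (1.0 - w) * x
            for x, e, w in zip(noisy, enhanced, weights)]

def fuse_utterance(noisy, enhanced, w):
    """Utterance-level OA: one shared weight for the whole utterance."""
    return fuse_frames(noisy, enhanced, [w] * len(noisy))

if __name__ == "__main__":
    noisy = [0.2, 0.4, 0.6]
    enhanced = [0.1, 0.3, 0.9]
    # Frame-level: fully trust enhancement on frame 0, mix on frame 1,
    # keep the noisy observation on frame 2.
    print(fuse_frames(noisy, enhanced, [1.0, 0.5, 0.0]))
    # Utterance-level: a single intermediate weight.
    print(fuse_utterance(noisy, enhanced, 0.5))
```

The frame-level variant allows the fusion to adapt within an utterance (e.g. down-weighting the enhanced signal only where artifacts appear), whereas the utterance-level variant uses a single scalar per utterance; the paper analyzes both granularities.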