Confidence-Guided Error Correction for Disordered Speech Recognition

📅 2025-09-29
đŸ€– AI Summary
Automatic speech recognition (ASR) systems exhibit poor robustness for speakers with speech impairments, and LLM-based post-processing of their outputs tends toward over-correction. Method: We propose a confidence-guided, fine-grained prompting method that explicitly incorporates word-level ASR confidence scores into the large language model (LLM) post-processing pipeline, specifically LLaMA 3.1. An uncertainty-aware prompting mechanism directs the LLM's attention to low-confidence segments, while a subsequent filtering step suppresses excessive corrections. Contribution/Results: This work is the first to integrate ASR confidence as a structured prompt signal across both LLM training and inference stages, significantly enhancing cross-speaker and cross-dataset generalization. Evaluated on the Speech Accessibility Project and TORGO datasets, our method achieves relative word error rate (WER) reductions of 10% and 47%, respectively, outperforming baseline LLM-based correction and post-hoc filtering approaches.
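The uncertainty-aware prompting idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `<unc>...</unc>` tag format, the threshold value, and the instruction wording are all assumptions; the paper only states that word-level confidence scores are embedded as a structured prompt signal.

```python
def build_prompt(words, confidences, threshold=0.5):
    """Build a confidence-informed prompt: mark low-confidence ASR words
    so the LLM focuses its corrections there (hypothetical tag format)."""
    tagged = []
    for word, conf in zip(words, confidences):
        if conf < threshold:
            tagged.append(f"<unc>{word}</unc>")  # flag uncertain region
        else:
            tagged.append(word)
    hypothesis = " ".join(tagged)
    return (
        "Correct only the words marked <unc>...</unc> in this ASR "
        f"hypothesis; leave confident words unchanged:\n{hypothesis}"
    )
```

The same tagged format could be used in fine-tuning targets as well as at inference, which is what the paper means by using confidence across both stages.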

📝 Abstract
We investigate the use of large language models (LLMs) as post-processing modules for automatic speech recognition (ASR), focusing on their ability to perform error correction for disordered speech. In particular, we propose confidence-informed prompting, where word-level uncertainty estimates are embedded directly into LLM training to improve robustness and generalization across speakers and datasets. This approach directs the model to uncertain ASR regions and reduces overcorrection. We fine-tune a LLaMA 3.1 model and compare our approach to both transcript-only fine-tuning and post hoc confidence-based filtering. Evaluations show that our method achieves a 10% relative WER reduction compared to naive LLM correction on the Speech Accessibility Project spontaneous speech and a 47% reduction on TORGO, demonstrating the effectiveness of confidence-aware fine-tuning for impaired speech.
Problem

Research questions and friction points this paper is trying to address.

How can LLMs correct ASR errors in disordered speech?
How can word-level uncertainty be integrated into LLM training?
How can overcorrection be reduced by targeting uncertain ASR regions?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Confidence-guided, fine-grained prompting that embeds word-level ASR confidence into LLM post-processing
Uncertainty-aware fine-tuning of LLaMA 3.1 using confidence signals at both training and inference
Post-correction filtering step that suppresses excessive corrections
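The filtering step that suppresses excessive corrections could look like the sketch below: align the LLM output against the original hypothesis and revert any edit that touches only high-confidence words. This alignment-based rule and the threshold are assumptions for illustration, not necessarily the paper's exact criterion.

```python
import difflib

def filter_corrections(orig_words, corrected_words, confidences, threshold=0.5):
    """Hypothetical post-hoc filter: keep an LLM edit only if it overlaps
    at least one low-confidence ASR word; otherwise revert to the original."""
    matcher = difflib.SequenceMatcher(a=orig_words, b=corrected_words)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(orig_words[i1:i2])
        elif any(c < threshold for c in confidences[i1:i2]):
            out.extend(corrected_words[j1:j2])  # edit touches uncertain words
        else:
            out.extend(orig_words[i1:i2])  # revert edit on confident words
    return out
```

For example, a fix of a low-confidence "too" to "to" survives the filter, while a spurious rewrite of a high-confidence word is reverted, which is the overcorrection behavior the paper aims to suppress.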
Abner Hernandez
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Germany
TomĂĄs Arias Vergara
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Germany; GITA Lab, Facultad de IngenierĂ­a, Universidad de Antioquia UdeA, MedellĂ­n, Colombia
Andreas Maier
Pattern Recognition Lab, Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg, Germany
Paula Andrea Pérez-Toro
Friedrich-Alexander-UniversitĂ€t Erlangen-NĂŒrnberg; Universidad de Antioquia
Machine Learning · Speech Analysis · Gait Analysis · Natural Language Processing · Deep Learning