🤖 AI Summary
This work addresses hallucination detection in speech large language models (SpeechLLMs), where existing methods either rely on costly ground-truth references or, having been designed for text-only LLMs, fail to capture audio-specific signals. The authors propose four audio-aware, attention-based metrics (AUDIORATIO, AUDIOCONSISTENCY, AUDIOENTROPY, and TEXTENTROPY) and show that a lightweight logistic regression classifier trained on features from only about 100 attention heads can identify hallucinations at inference time. Evaluated on Qwen-2-Audio and Voxtral-3B across automatic speech recognition and speech-to-text translation, the approach outperforms uncertainty-based and prior attention-based baselines on in-domain data by up to 0.23 PR-AUC and generalizes to out-of-domain ASR settings. Effectiveness is model-dependent, and task-specific training is required.
📝 Abstract
Hallucinations in Speech Large Language Models (SpeechLLMs) pose significant risks, yet existing detection methods typically rely on gold-standard outputs that are costly or impractical to obtain. Moreover, hallucination detection methods developed for text-based LLMs do not directly capture audio-specific signals. We investigate four attention-derived metrics: AUDIORATIO, AUDIOCONSISTENCY, AUDIOENTROPY, and TEXTENTROPY, designed to capture pathological attention patterns associated with hallucination, and train lightweight logistic regression classifiers on these features for efficient inference-time detection. Across automatic speech recognition and speech-to-text translation tasks, evaluations on Qwen-2-Audio and Voxtral-3B show that our approach outperforms uncertainty-based and prior attention-based baselines on in-domain data, achieving improvements of up to +0.23 PR-AUC, and generalises to out-of-domain ASR settings. We further find that strong performance can be achieved with approximately 100 attention heads, improving out-of-domain generalisation compared to using all heads. While effectiveness is model-dependent and task-specific training is required, our results demonstrate that attention patterns provide a valuable tool for hallucination detection in SpeechLLMs.
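To make the pipeline described above more concrete, here is a minimal Python sketch of how per-head attention features could be computed and passed to a logistic regression classifier. The abstract does not define the metrics, so the formulas below are illustrative assumptions rather than the paper's definitions: the audio ratio is taken as the share of attention mass on audio positions, and the two entropies as Shannon entropies of the attention renormalised over audio and text positions. AUDIOCONSISTENCY is omitted because nothing in the abstract indicates how it is computed. The names `head_features` and `fit_logreg` are hypothetical, and the gradient-descent fit is a stand-in for any library logistic regression.

```python
import numpy as np


def _entropy(p, eps=1e-12):
    """Shannon entropy (in nats) along the last axis of a probability array."""
    return -(p * np.log(p + eps)).sum(axis=-1)


def head_features(attn, audio_mask, eps=1e-12):
    """Per-head features for one generated token (assumed definitions).

    attn:       (n_heads, n_ctx) attention weights, each row summing to 1.
    audio_mask: (n_ctx,) boolean array, True where the position is audio.
    Returns an (n_heads, 3) array: audio ratio, audio entropy, text entropy.
    """
    audio = attn[:, audio_mask]
    text = attn[:, ~audio_mask]
    audio_mass = audio.sum(axis=-1)
    text_mass = text.sum(axis=-1)
    # Share of each head's attention that lands on audio positions.
    audio_ratio = audio_mass / (audio_mass + text_mass + eps)
    # Entropy of attention renormalised within each modality.
    audio_entropy = _entropy(audio / (audio_mass[:, None] + eps))
    text_entropy = _entropy(text / (text_mass[:, None] + eps))
    return np.stack([audio_ratio, audio_entropy, text_entropy], axis=-1)


def fit_logreg(X, y, lr=0.1, steps=500):
    """Plain gradient-descent logistic regression (illustrative stand-in)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * (p - y).mean()
    return w, b


# Toy example: two heads, four context positions (two audio, two text).
attn = np.array([[0.25, 0.25, 0.25, 0.25],
                 [0.70, 0.10, 0.10, 0.10]])
audio_mask = np.array([True, True, False, False])
feats = head_features(attn, audio_mask)

# Toy classifier input: one feature per example, label 1 = hallucination.
X_toy = np.array([[0.0], [0.1], [0.9], [1.0]])
y_toy = np.array([0.0, 0.0, 1.0, 1.0])
w_toy, b_toy = fit_logreg(X_toy, y_toy)
```

In a real setting, the features for a selected subset of heads (the abstract reports that roughly 100 suffice) would be flattened into one vector per example before fitting. How heads are selected and how token-level features are aggregated are not specified in the abstract.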