🤖 AI Summary
This study addresses the reliability limitations of current large audio-language models, which often generate hallucinated responses unrelated to the input audio. To systematically evaluate and mitigate this issue, the authors propose the Audio Hallucination Attacks (AHA) framework, a dual-path attack strategy comprising query-structure induction and synthetic speech injection. They further release AHA-Eval, a benchmark of 6.5K question-answer pairs, and AHA-Guard, a post-alignment training set of 120K samples designed to enhance model robustness. Experimental results show that AHA achieves attack success rates of 95.35% on Audio Flamingo 3 and 79.65% on Gemini 3 Pro, while fine-tuning with AHA-Guard reduces these rates by up to 49%, substantially improving model resilience against audio-induced hallucinations.
📝 Abstract
Large Audio Language Models (LALMs) achieve strong performance on audio-language tasks; however, their reliability in real-world settings remains underexplored. We introduce Audio Hallucination Attacks (AHA), an attack framework, together with AHA-Eval, a benchmark of 6.5K QA pairs designed to test whether LALMs genuinely ground their responses in the audio input. AHA targets two attack surfaces: (i) query-based attacks, which exploit question structure to induce hallucinations about absent sounds, and (ii) audio-based attacks, which inject synthetic speech describing non-existent events into the audio stream. Evaluating state-of-the-art LALMs, including Audio Flamingo 3 and Gemini 3 Pro, we observe high attack success rates of 95.35% and 79.65%, respectively, revealing a reliability gap hidden by standard benchmark performance. To mitigate this, we propose AHA-Guard, a 120K QA post-alignment dataset that reduces attack success rates by up to 49%.
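The two attack surfaces can be illustrated with a minimal sketch. The prompt wording, the TTS interface (`synthesize_speech`), and the mixing gain below are our own illustrative assumptions, not the authors' released implementation:

```python
import numpy as np

def query_based_attack(absent_sound: str) -> str:
    """Build a leading question that presupposes a sound absent from the audio
    (query-structure induction)."""
    # Presuppositional phrasing nudges the model to describe a non-existent event.
    return f"At what point in the clip does the {absent_sound} occur, and how loud is it?"

def audio_based_attack(clean_audio: np.ndarray, sr: int, fake_event: str,
                       synthesize_speech) -> np.ndarray:
    """Overlay synthetic speech describing a non-existent event onto the audio
    stream (synthetic speech injection). `synthesize_speech` is a stand-in for
    any TTS engine returning a waveform at sample rate `sr`."""
    injected = synthesize_speech(f"You can clearly hear a {fake_event} in the background.", sr)
    # Pad or trim the injected speech to the clip length, then mix at reduced gain.
    if len(injected) < len(clean_audio):
        injected = np.pad(injected, (0, len(clean_audio) - len(injected)))
    else:
        injected = injected[: len(clean_audio)]
    return clean_audio + 0.3 * injected

# Hypothetical usage: build one adversarial QA pair for evaluation.
# question = query_based_attack("glass shattering")
# attacked_wav = audio_based_attack(wav, 16000, "fire alarm", my_tts)
```

A model that genuinely grounds its answer in the audio should reject both manipulations; a model that hallucinates the presupposed or narrated event counts as a successful attack.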