🤖 AI Summary
This study addresses the reliability limitations of current large audio-language models, which often generate hallucinated responses unrelated to the input audio. To systematically evaluate and mitigate this issue, the authors propose the Audio Hallucination Attacks (AHA) framework, a dual-path attack strategy comprising query-structure induction and synthetic speech injection. They further release AHA-Eval, a benchmark of 6.5K question-answer pairs, and AHA-Guard, a post-alignment training set of 120K samples designed to enhance model robustness. Experimental results show that AHA achieves attack success rates of 95.35% on Audio Flamingo 3 and 79.65% on Gemini 3 Pro, while fine-tuning with AHA-Guard reduces these rates by up to 49%, substantially improving model resilience against audio-induced hallucinations.
📝 Abstract
Large Audio Language Models (LALMs) achieve strong performance on audio-language tasks; however, their reliability in real-world settings remains underexplored. We introduce Audio Hallucination Attacks (AHA), an attack framework, together with AHA-Eval, a benchmark of 6.5K QA pairs designed to test whether LALMs genuinely ground their responses in the audio input. AHA targets two attack surfaces: (i) query-based attacks, which exploit question structure to induce hallucinations about absent sounds, and (ii) audio-based attacks, which inject synthetic speech describing non-existent events into the audio stream. Evaluating state-of-the-art LALMs, including Audio Flamingo 3 and Gemini 3 Pro, we observe high attack success rates of 95.35% and 79.65%, respectively, revealing a reliability gap hidden by standard benchmark performance. To mitigate this, we propose AHA-Guard, a 120K QA post-alignment dataset that reduces attack success rates by up to 49%.
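The two attack surfaces can be illustrated with a minimal sketch. The prompt wording, the TTS interface (`synthesize_speech`), and the mixing gain below are our own illustrative assumptions, not the authors' released implementation:

```python
import numpy as np

def query_based_attack(absent_sound: str) -> str:
    """Build a leading question that presupposes a sound absent from the audio
    (query-structure induction)."""
    # Presuppositional phrasing nudges the model to describe a non-existent event.
    return f"At what point in the clip does the {absent_sound} occur, and how loud is it?"

def audio_based_attack(clean_audio: np.ndarray, sr: int, fake_event: str,
                       synthesize_speech) -> np.ndarray:
    """Overlay synthetic speech describing a non-existent event onto the audio
    stream (synthetic speech injection). `synthesize_speech` is a stand-in for
    any TTS engine returning a waveform at sample rate `sr`."""
    injected = synthesize_speech(f"You can clearly hear a {fake_event} in the background.", sr)
    # Pad or trim the injected speech to the clip length, then mix at reduced gain.
    if len(injected) < len(clean_audio):
        injected = np.pad(injected, (0, len(clean_audio) - len(injected)))
    else:
        injected = injected[: len(clean_audio)]
    return clean_audio + 0.3 * injected

# Hypothetical usage: build one adversarial QA pair for evaluation.
# question = query_based_attack("glass shattering")
# attacked_wav = audio_based_attack(wav, 16000, "fire alarm", my_tts)
```

A model that genuinely grounds its answer in the audio should reject both manipulations; a model that hallucinates the presupposed or narrated event counts as a successful attack.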