When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs

📅 2026-04-23

🤖 AI Summary

This work addresses the susceptibility of large vision-language models (LVLMs) to hallucinations induced by textual prompts that contradict the visual input. It quantifies how much textual instruction priors contribute to such hallucinations using HalluScope, a dedicated evaluation benchmark, and proposes HalluVL-DPO, a framework that fine-tunes off-the-shelf LVLMs on a curated preference dataset via Direct Preference Optimization (DPO). The training signal explicitly rewards grounded responses over hallucinated ones, pushing the model to rely more on visual evidence during generation. According to the authors, HalluVL-DPO mitigates prompt-induced hallucinations while preserving or improving performance on other hallucination benchmarks and visual capability evaluations.

📝 Abstract

Despite impressive progress in the capabilities of large vision-language models (LVLMs), these systems remain vulnerable to hallucinations, i.e., outputs that are not grounded in the visual input. Prior work has attributed hallucinations in LVLMs to factors such as limitations of the vision backbone or the dominance of the language component, yet the relative importance of these factors remains unclear. To resolve this ambiguity, we propose HalluScope, a benchmark to better understand the extent to which different factors induce hallucinations. Our analysis indicates that hallucinations largely stem from excessive reliance on textual priors and background knowledge, especially information introduced through textual instructions. To mitigate hallucinations induced by textual instruction priors, we propose HalluVL-DPO, a framework for fine-tuning off-the-shelf LVLMs towards more visually grounded responses. HalluVL-DPO leverages preference optimization using a curated training dataset that we construct, guiding the model to prefer grounded responses over hallucinated ones. We demonstrate that our optimized model effectively mitigates the targeted hallucination failure mode, while preserving or improving performance on other hallucination benchmarks and visual capability evaluations. To support reproducibility and further research, we will publicly release our evaluation benchmark, preference training dataset, and code at https://pegah-kh.github.io/projects/prompts-override-vision/.
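
To make the preference-optimization idea concrete, the sketch below computes the standard DPO loss for a single preference pair in which the grounded response is preferred over the hallucinated one. This is an illustration only: the abstract does not state the exact objective, so it is an assumption that HalluVL-DPO uses the unmodified DPO loss. The function name, argument names, and the default beta of 0.1 are hypothetical.

```python
import math


def dpo_pair_loss(policy_logp_grounded, policy_logp_hallucinated,
                  ref_logp_grounded, ref_logp_hallucinated, beta=0.1):
    """Standard DPO loss for one (grounded, hallucinated) response pair.

    Each argument is the summed log-probability of a full response given
    the image and prompt, under either the policy being fine-tuned or the
    frozen reference model. All names here are illustrative assumptions.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    grounded_shift = policy_logp_grounded - ref_logp_grounded
    hallucinated_shift = policy_logp_hallucinated - ref_logp_hallucinated

    # Scaled margin: positive when the policy has moved toward the
    # grounded response relative to the hallucinated one.
    margin = beta * (grounded_shift - hallucinated_shift)

    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_pair_loss(-5.0, -5.0, -5.0, -5.0))
    # Policy favors the grounded response: the loss drops below log(2).
    print(dpo_pair_loss(-2.0, -8.0, -5.0, -5.0))
```

Minimizing this quantity over a dataset of such pairs raises the policy's relative likelihood of grounded responses while the reference terms keep it anchored to the original model.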

Problem

Research questions and friction points this paper is trying to address.

hallucinations

large vision-language models

textual priors

visual grounding

prompt-induced errors

Innovation

Methods, ideas, or system contributions that make the work stand out.

hallucination mitigation

vision-language models

preference optimization