AI Summary
This work addresses the persistent challenge of hallucinations in large vision-language models, which often stem from linguistic biases and lack identifiable, stable patterns. To this end, we systematically construct Hallucination-Inducing Images (HIIs) for the first time, uncovering reproducible hallucination mechanisms under specific scene conditions. Building on these insights, we introduce the Masked-Object-Hallucination (MOH) benchmark for targeted evaluation. Leveraging synthetic HII data and fine-grained preference annotations, we apply Direct Preference Optimization (DPO) for alignment training, achieving effective hallucination mitigation. Our approach improves performance by up to 38% over current state-of-the-art methods on standard hallucination benchmarks while preserving the model's general capabilities.
Abstract
Large Vision-Language Models (VLMs) have achieved remarkable success across diverse multimodal tasks but remain vulnerable to hallucinations rooted in inherent language bias. Despite recent progress, existing hallucination mitigation methods often overlook the underlying hallucination patterns driven by language bias. In this work, we design a novel pipeline to accurately synthesize Hallucination-Inducing Images (HIIs). Using synthesized HIIs, we reveal a consistent scene-conditioned hallucination pattern: models tend to mention objects that are highly typical of the scene even when visual evidence is removed. To quantify the susceptibility of VLMs to this hallucination pattern, we establish the Masked-Object-Hallucination (MOH) benchmark to rigorously evaluate existing state-of-the-art alignment frameworks. Finally, we leverage HIIs to construct high-quality preference datasets for fine-grained alignment. Experimental results demonstrate that our approach effectively mitigates hallucinations while preserving general model capabilities. Specifically, our method achieves up to a 38% improvement over the current state-of-the-art on standard hallucination benchmarks.
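The alignment stage relies on the standard DPO objective from Rafailov et al., applied here to preference pairs derived from HIIs (e.g., a faithful caption as the chosen response versus one mentioning a masked, scene-typical object as the rejected response). As a minimal sketch of that objective for a single pair, where the log-probabilities and the `beta` value are illustrative inputs rather than values from the paper:

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    logp_*      : log-probability of the response under the policy model
    ref_logp_*  : log-probability under the frozen reference model
    beta        : temperature controlling deviation from the reference
    """
    # Implicit reward of each response: beta * log(pi_theta / pi_ref)
    reward_chosen = beta * (logp_chosen - ref_logp_chosen)
    reward_rejected = beta * (logp_rejected - ref_logp_rejected)
    # Loss is -log sigmoid(reward margin); minimized when the policy
    # prefers the chosen (non-hallucinating) response over the rejected one.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy shows no preference (zero margin) the loss is log 2 ≈ 0.693; pushing probability mass toward the chosen response drives it toward zero, which is how preference pairs built from HIIs penalize scene-typical but visually unsupported objects.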