🤖 AI Summary
This work identifies the visual encoder, rather than the language model, as the root cause of object hallucination in large vision-language models (LVLMs), and is the first to systematically characterize three of its deficiencies: statistical bias, inherent bias, and vulnerability. To address these, the authors propose SHIELD, a lightweight, training-free defense framework that applies three decoupled interventions: (i) visual token re-weighting to mitigate statistical bias; (ii) noise-derived token injection to suppress inherent bias; and (iii) adversarial perturbation combined with contrastive decoding to improve robustness. Evaluated across diverse LVLM architectures (e.g., LLaVA, Qwen-VL) and standard hallucination benchmarks (POPE, MME-Hallucination), SHIELD reduces hallucination rates by an average of 21.3% while preserving, and in some cases improving, performance on downstream tasks. Crucially, it generalizes across models and requires no architectural modification or parameter update.
📝 Abstract
Large Vision-Language Models (LVLMs) excel at diverse cross-modal tasks. However, object hallucination, where models produce plausible but inaccurate object descriptions, remains a significant challenge. In contrast to previous work focusing on the LLM component, this paper is the first to trace LVLM hallucinations to the visual encoder, identifying three key issues: statistical bias, inherent bias, and vulnerability. To address these challenges, we propose SHIELD, a training-free framework that mitigates hallucinations through three strategies: re-weighting visual tokens to reduce statistical bias, introducing noise-derived tokens to counter inherent bias, and applying adversarial attacks with contrastive decoding to address vulnerability. Experiments demonstrate that SHIELD effectively mitigates object hallucination across diverse benchmarks and LVLM families. Moreover, SHIELD also performs strongly on a general LVLM benchmark, highlighting its broad applicability. Code will be released.
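The third strategy pairs an adversarial perturbation of the input image with contrastive decoding. A minimal sketch of one such decoding step is below, assuming a VCD-style logit combination in which tokens that remain likely even when visual evidence is degraded (i.e., tokens driven by language priors) are down-weighted; the exact formula, the `alpha` coefficient, and the function names are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def contrastive_decode_step(logits_clean, logits_perturbed, alpha=1.0):
    """Combine next-token logits from the clean image and a perturbed image.

    Assumed VCD-style rule: (1 + alpha) * clean - alpha * perturbed.
    A token whose logit survives perturbation is likely prior-driven
    (a hallucination candidate), so the subtraction suppresses it.
    """
    clean = np.asarray(logits_clean, dtype=float)
    perturbed = np.asarray(logits_perturbed, dtype=float)
    return (1.0 + alpha) * clean - alpha * perturbed

# Toy 4-token vocabulary: token 1's score barely changes under
# perturbation, suggesting it is not grounded in the image.
clean = np.array([2.0, 1.0, 0.5, 0.1])
perturbed = np.array([0.5, 1.2, 0.5, 0.1])
combined = contrastive_decode_step(clean, perturbed)
next_token = int(np.argmax(combined))  # token 0: visually grounded
```

In a full pipeline, `logits_perturbed` would come from a second forward pass of the LVLM on the adversarially perturbed image, and the combined logits would feed the usual sampling or greedy-decoding loop.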