Language Bias in LVLMs: From In-Depth Analysis to Simple and Effective Mitigation

📅 2026-05-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses language bias in large vision-language models (LVLMs): an overreliance on textual cues that leads to hallucinated outputs inconsistent with the visual content. The study traces this bias to modality misalignment during training, in particular to visual instruction tuning (VIT) and direct preference optimization (DPO) often prioritizing textual improvements over balanced multimodal understanding. To mitigate it, the authors propose two lightweight mechanisms that require no additional data or auxiliary models: Language Bias Regularization (LBR), applied during instruction tuning, and Language Bias Penalty (LBP), applied within DPO. In the reported experiments, LBR consistently improves performance on more than ten general benchmarks, while LBP significantly reduces hallucination. Together, the two methods improve multimodal alignment and the trustworthiness of LVLM-generated responses.
📝 Abstract
Large Vision-Language Models (LVLMs) extend large language models with visual understanding, but remain vulnerable to hallucination, where outputs are fluent yet inconsistent with images. Recent studies link this issue to language bias: the tendency of LVLMs to over-rely on text while neglecting visual inputs. Yet most analyses remain empirical without uncovering its underlying cause. In this paper, we provide a systematic study of language bias and identify its root in modality misalignment during training. Our analysis shows that both Visual Instruction Tuning (VIT) and Direct Preference Optimization (DPO) often prioritize textual improvements, which may cause LVLMs to overly lean toward language modeling rather than balanced multimodal understanding. To address this, we propose two simple yet effective methods: Language Bias Regularization (LBR), which mitigates language bias through regularization during instruction tuning, and Language Bias Penalty (LBP), which penalizes language bias in the DPO training process. Extensive experiments across diverse models and benchmarks demonstrate the effectiveness of our approach. LBR consistently improves performance on over ten general benchmarks, while LBP significantly reduces hallucination and improves trustworthiness. Together, these methods not only mitigate language bias but also advance the overall alignment of LVLMs, all without introducing any additional data or auxiliary models. Our code is publicly available at https://github.com/lab-klc/LVLM-Language-Bias.
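The abstract does not give the exact form of the Language Bias Penalty, so the following is only a minimal sketch of the general idea of penalizing language bias inside a DPO objective. The first two functions implement the standard DPO loss on sequence log-probabilities. The third adds a purely hypothetical penalty: it recomputes the preference margin from log-probabilities obtained without the image and penalizes that margin when it is positive, on the assumption that a preference the model can express from text alone reflects language bias. The function names, the `text_only` inputs, and the weight `lam` are illustrative assumptions and are not taken from the paper or its repository.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_margin(pi_w: float, pi_l: float, ref_w: float, ref_l: float,
               beta: float = 0.1) -> float:
    """Scaled implicit-reward margin between chosen (w) and rejected (l).

    Inputs are sequence log-probabilities under the policy (pi_*) and
    the frozen reference model (ref_*).
    """
    return beta * ((pi_w - ref_w) - (pi_l - ref_l))


def dpo_loss(pi_w: float, pi_l: float, ref_w: float, ref_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair."""
    return -log_sigmoid(dpo_margin(pi_w, pi_l, ref_w, ref_l, beta))


def dpo_loss_with_text_only_penalty(with_image, text_only,
                                    beta: float = 0.1,
                                    lam: float = 1.0) -> float:
    """Hypothetical variant: DPO loss plus a text-only margin penalty.

    ASSUMPTION: this penalty form is illustrative only and is not the
    paper's LBP. `with_image` and `text_only` are 4-tuples
    (pi_w, pi_l, ref_w, ref_l) computed with and without the image.
    """
    base = dpo_loss(*with_image, beta=beta)
    blind_margin = dpo_margin(*text_only, beta=beta)
    # Penalize only when the preference survives removal of the image.
    return base + lam * max(0.0, blind_margin)


if __name__ == "__main__":
    pair = (-1.0, -2.0, -1.5, -1.5)
    print(dpo_loss(*pair))
    print(dpo_loss_with_text_only_penalty(pair, pair))
```

In this sketch the penalty vanishes whenever the text-only margin is zero or negative, so the objective reduces to plain DPO for pairs where the image is needed to tell the responses apart.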
Problem

Research questions and friction points this paper is trying to address.

language bias
hallucination
vision-language models
modality misalignment
multimodal understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

language bias
modality misalignment
Language Bias Regularization
Language Bias Penalty
vision-language models