Towards Analyzing and Mitigating Sycophancy in Large Vision-Language Models

📅 2024-08-21
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
Large Vision-Language Models (LVLMs) exhibit pervasive "sycophancy": an over-reliance on leading or deceptive prompts that biases outputs and induces hallucination. This work is the first to systematically reveal LVLMs' widespread lack of robustness to sycophancy across diverse multimodal tasks. The authors propose Leading Query Contrastive Decoding (LQCD), a model-agnostic, decoding-time method that calibrates reliance on leading cues by identifying sycophancy tokens through text-based contrastive analysis and suppressing their probabilities. LQCD requires no model fine-tuning, is compatible with any LVLM, and significantly reduces sycophantic bias and hallucination on curated leading queries across multiple benchmarks, outperforming existing prompt-engineering and hallucination-mitigation approaches. Moreover, it slightly improves response quality on neutral queries without compromising task performance, offering a practical, plug-and-play solution for enhancing LVLM reliability in real-world applications.

📝 Abstract
Large Vision-Language Models (LVLMs) have shown significant capability in vision-language understanding. However, one critical issue that persists in these models is sycophancy, meaning the models are unduly influenced by leading or deceptive prompts, resulting in biased outputs and hallucinations. Despite the progress in LVLMs, evaluating and mitigating sycophancy remains largely under-explored. In this work, we fill this gap by systematically analyzing sycophancy on various VL benchmarks with curated leading queries and further proposing a text contrastive decoding method for mitigation. While the specific sycophantic behavior varies significantly among models, our analysis reveals a severe deficiency in all LVLMs' resilience to sycophancy across various tasks. For improvement, we propose Leading Query Contrastive Decoding (LQCD), a model-agnostic method that calibrates the LVLMs' over-reliance on leading cues by identifying and suppressing the probabilities of sycophancy tokens at the decoding stage. Extensive experiments show that LQCD effectively mitigates sycophancy, outperforming both prompt engineering methods and common methods for hallucination mitigation. We further demonstrate that LQCD does not hurt, but even slightly improves, LVLMs' responses to neutral queries, suggesting it is an effective strategy for general-purpose decoding rather than one limited to sycophancy.
Problem

Research questions and friction points this paper is trying to address.

Analyzing sycophancy in Large Vision-Language Models (LVLMs)
Mitigating prompt-induced bias in LVLM outputs
Developing an inference-time framework for robust multimodal reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free model-agnostic inference framework
Query neutralizer suppresses sycophantic bias
Contrastive decoding recalibrates output distributions
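The contrastive-decoding idea above can be sketched as a per-step adjustment of next-token logits: logits conditioned on a neutralized query are contrasted against logits conditioned on the original leading query, so tokens whose probability is inflated only by the leading cue are suppressed. This is a minimal illustrative sketch, not the paper's implementation; the mixing weight `alpha` and the toy logit values are assumptions.

```python
import numpy as np

def contrastive_decode_step(leading_logits, neutral_logits, alpha=1.0):
    """One decoding step of leading-query contrastive decoding (sketch).

    leading_logits: next-token logits conditioned on the original leading query.
    neutral_logits: logits conditioned on a neutralized version of the query.
    alpha: illustrative contrast strength (an assumption, not the paper's value).

    Tokens boosted by the leading cue relative to the neutral query get their
    logits pushed down; the result is renormalized with a softmax.
    """
    adjusted = (1.0 + alpha) * neutral_logits - alpha * leading_logits
    exp = np.exp(adjusted - adjusted.max())  # numerically stable softmax
    return exp / exp.sum()                   # next-token distribution

# Toy example: token 0 is inflated only under the leading query.
leading = np.array([3.0, 0.0, 0.0])
neutral = np.array([0.0, 0.0, 0.0])
probs = contrastive_decode_step(leading, neutral, alpha=1.0)
```

In this toy case the sycophancy-prone token 0 ends up with a lower probability than the other tokens, which is the calibration effect the Innovation bullets describe.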
Yunpu Zhao
University of Science and Technology of China
Large Vision-Language Models, Deep Learning, Cognitive Science, Computer Vision
Rui Zhang
State Key Lab of Processors, Institute of Computing Technology, China
Junbin Xiao
National University of Singapore
Video and Language, Embodied Interaction, Trustworthy Multimodality
Changxin Ke
University of Chinese Academy of Sciences, China
Ruibo Hou
UIUC
NLP, AI4SCIENCE
Yifan Hao
State Key Lab of Processors, Institute of Computing Technology, China
Qi Guo
State Key Lab of Processors, Institute of Computing Technology, China
Yunji Chen
Institute of Computing Technology, Chinese Academy of Sciences
processor architecture, microarchitecture, machine learning