Towards Analyzing and Mitigating Sycophancy in Large Vision-Language Models

📅 2024-08-21
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
Large Vision-Language Models (LVLMs) exhibit pervasive "sycophancy": an over-reliance on leading or deceptive prompts that biases outputs and induces hallucination. This work is the first to systematically reveal LVLMs' widespread lack of robustness to sycophancy across diverse multimodal tasks. The authors propose Leading Query Contrastive Decoding (LQCD), a model-agnostic, decoding-time method that calibrates reliance on leading cues by identifying sycophancy tokens through text-based contrastive analysis and suppressing their probabilities. LQCD requires no model fine-tuning, is compatible with any LVLM, and significantly reduces sycophantic bias and hallucination on curated leading queries across multiple benchmarks, outperforming existing prompt-engineering and hallucination-mitigation approaches. Moreover, it slightly improves response quality on neutral queries without compromising task performance, offering a practical, plug-and-play solution for enhancing LVLM reliability in real-world applications.

📝 Abstract
Large Vision-Language Models (LVLMs) have shown significant capability in vision-language understanding. However, one critical issue that persists in these models is sycophancy, meaning the models are unduly influenced by leading or deceptive prompts, resulting in biased outputs and hallucinations. Despite the progress in LVLMs, evaluating and mitigating sycophancy remains largely under-explored. In this work, we fill this gap by systematically analyzing sycophancy on various VL benchmarks with curated leading queries and further proposing a text contrastive decoding method for mitigation. While the specific sycophantic behavior varies significantly among models, our analysis reveals a severe deficiency in all LVLMs' resilience to sycophancy across various tasks. For improvement, we propose Leading Query Contrastive Decoding (LQCD), a model-agnostic method that calibrates the LVLMs' over-reliance on leading cues by identifying and suppressing the probabilities of sycophancy tokens at the decoding stage. Extensive experiments show that LQCD effectively mitigates sycophancy, outperforming both prompt engineering methods and common methods for hallucination mitigation. We further demonstrate that LQCD does not hurt, but even slightly improves, LVLMs' responses to neutral queries, suggesting it is an effective strategy for general-purpose decoding rather than one limited to sycophancy.
Problem

Research questions and friction points this paper is trying to address.

Analyzing sycophancy in Large Vision-Language Models (LVLMs)
Mitigating prompt-induced bias in LVLM outputs
Developing an inference-time framework for robust multimodal reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free model-agnostic inference framework
Query neutralizer suppresses sycophantic bias
Contrastive decoding recalibrates output distributions
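The contrastive-decoding idea above can be sketched as a per-step adjustment of next-token logits: logits conditioned on a neutralized query are contrasted against logits conditioned on the original leading query, so tokens whose probability is inflated only by the leading cue are suppressed. This is a minimal illustrative sketch, not the paper's implementation; the mixing weight `alpha` and the toy logit values are assumptions.

```python
import numpy as np

def contrastive_decode_step(leading_logits, neutral_logits, alpha=1.0):
    """One decoding step of leading-query contrastive decoding (sketch).

    leading_logits: next-token logits conditioned on the original leading query.
    neutral_logits: logits conditioned on a neutralized version of the query.
    alpha: illustrative contrast strength (an assumption, not the paper's value).

    Tokens boosted by the leading cue relative to the neutral query get their
    logits pushed down; the result is renormalized with a softmax.
    """
    adjusted = (1.0 + alpha) * neutral_logits - alpha * leading_logits
    exp = np.exp(adjusted - adjusted.max())  # numerically stable softmax
    return exp / exp.sum()                   # next-token distribution

# Toy example: token 0 is inflated only under the leading query.
leading = np.array([3.0, 0.0, 0.0])
neutral = np.array([0.0, 0.0, 0.0])
probs = contrastive_decode_step(leading, neutral, alpha=1.0)
```

In this toy case the sycophancy-prone token 0 ends up with a lower probability than the other tokens, which is the calibration effect the Innovation bullets describe.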
Yunpu Zhao
University of Science and Technology of China
Large Vision-Language Models, Deep Learning, Cognitive Science, Computer Vision
Rui Zhang
State Key Lab of Processors, Institute of Computing Technology, China
Junbin Xiao
National University of Singapore
Video and Language, Embodied Interaction, Trustworthy Multimodality
Changxin Ke
University of Chinese Academy of Sciences, China
Ruibo Hou
UIUC
NLP, AI4SCIENCE
Yifan Hao
State Key Lab of Processors, Institute of Computing Technology, China
Qi Guo
State Key Lab of Processors, Institute of Computing Technology, China
Yunji Chen
Institute of Computing Technology, Chinese Academy of Sciences
processor architecture, microarchitecture, machine learning