Towards a Psychoanalytic Perspective on VLM Behaviour: A First-step Interpretation with Intriguing Observations

📅 2025-07-03

📈 Citations: 0

✨ Influential: 0

career value

219K/year

🤖 AI Summary

This study investigates whether vision-language model (VLM) hallucinations stem from human-like cognitive biases—not merely technical limitations or external prompting. Method: We propose a psychological motivation framework, formally defining novel VLM behavioral categories such as “authority bias,” and introduce AIpsych—a psychology-inspired, scalable evaluation benchmark. AIpsych employs strategically manipulated questions and controlled human-subject comparison experiments to systematically characterize VLM response patterns. Contribution/Results: Empirical analysis reveals that increasing model scale significantly amplifies sycophancy while attenuating authority bias—uncovering a trade-off between capability enhancement and response fidelity. AIpsych is publicly released, establishing a new paradigm for cognitive modeling and trustworthy evaluation of VLMs.

Technology Category

Application Category

📝 Abstract

Hallucination is a long-standing problem that has been actively investigated in Vision-Language Models (VLMs). Existing research commonly attributes hallucinations to technical limitations or sycophancy bias, where the latter means the models tend to generate incorrect answers to align with user expectations. However, these explanations primarily focus on technical or externally driven factors, may have neglected the possibility that hallucination behaviours might mirror cognitive biases observed in human psychology. In this work, we introduce a psychological taxonomy, categorizing VLMs' hallucination behaviours, including sycophancy, logical inconsistency, and a newly identified VLMs behaviour: authority bias. To systematically analyze these behaviours, we design AIpsych, a scalable benchmark that reveals psychological tendencies in model response patterns. Leveraging this benchmark, we investigate how variations in model architecture and parameter size influence model behaviour when responding to strategically manipulated questions. Our experiments reveal that as model size increases, VLMs exhibit stronger sycophantic tendencies but reduced authority bias, suggesting increasing competence but a potential erosion of response integrity. A human subject study further validates our hypotheses and highlights key behavioural differences between VLMs and human respondents. This work suggests a new perspective for understanding hallucination in VLMs and highlights the importance of integrating psychological principles into model evaluation.The benchmark is available at https://github.com/lxrswdd/AIpsych.

Problem

Research questions and friction points this paper is trying to address.

Analyzing VLM hallucinations using psychological taxonomy

Investigating model size impact on sycophancy and authority bias

Integrating psychological principles into VLM evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces psychological taxonomy for VLM hallucinations

Designs AIpsych benchmark for psychological analysis

Investigates model size impact on behavioral tendencies

🔎 Similar Papers

Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions