CDH-Bench: A Commonsense-Driven Hallucination Benchmark for Evaluating Visual Fidelity in Vision-Language Models

📅 2026-03-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the tendency of vision-language models to generate “commonsense-driven hallucinations”—overreliance on prior knowledge at the expense of visual evidence—when confronted with conflicting image content. The study formally defines and quantifies this phenomenon, introducing CDH-Bench, a novel benchmark comprising anomalous scenarios across counting, relational, and attribute-based tasks. It proposes a multidimensional conflict testing framework alongside new evaluation metrics, including counterfactual accuracy and commonsense collapse rate. Through binary classification and multiple-choice question-answering tasks, the paper systematically evaluates the visual fidelity of state-of-the-art models. Experimental results reveal a pervasive failure among current models to prioritize visual input over ingrained priors, thereby demonstrating the necessity and effectiveness of the proposed benchmark for diagnosing and advancing model robustness.
📝 Abstract
Vision-language models (VLMs) achieve strong performance on many benchmarks, yet a basic reliability question remains underexplored: when visual evidence conflicts with commonsense, do models follow what is shown or what commonsense suggests? A characteristic failure in this setting is that the model overrides the visual evidence and outputs the commonsense alternative. We term this phenomenon commonsense-driven hallucination (CDH). To evaluate it, we introduce CDH-Bench, a benchmark designed to create explicit visual evidence-commonsense conflicts. CDH-Bench covers three dimensions: counting anomalies, relational anomalies, and attribute anomalies. We evaluate frontier VLMs under binary question answering (QA) and multiple-choice QA, and report metrics including Counterfactual Accuracy (CF-Acc), Commonsense Accuracy (CS-Acc), Counterfactual Accuracy Drop (CFAD), Commonsense Collapse Rate (CCR), and Relative Prior Dependency (RPD). Results show that even strong models remain vulnerable to prior-driven normalization when visual evidence contradicts commonsense, and CDH-Bench offers a controlled diagnostic of visual fidelity in this setting.
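The abstract names the benchmark's metrics but does not spell out their formulas. As a rough illustration of how such metrics could be computed, here is a minimal sketch; the definitions below (CFAD as CS-Acc minus CF-Acc, CCR as the share of counterfactual errors that match the commonsense prior, RPD as CFAD normalized by CS-Acc) are assumptions inferred from the metric names, not the paper's stated formulas.

```python
# Hedged sketch of CDH-Bench-style metrics under ASSUMED definitions.
# cf_results: per-item (prediction, counterfactual_answer, commonsense_answer)
#             on anomalous images, where the image contradicts commonsense.
# cs_results: per-item (prediction, answer) on normal images.

def cdh_metrics(cf_results, cs_results):
    # Counterfactual Accuracy: fraction of anomalous images answered
    # according to what the image actually shows.
    cf_acc = sum(p == gt for p, gt, _ in cf_results) / len(cf_results)

    # Commonsense Accuracy: accuracy on commonsense-consistent images.
    cs_acc = sum(p == gt for p, gt in cs_results) / len(cs_results)

    # Counterfactual Accuracy Drop (assumed: CS-Acc - CF-Acc).
    cfad = cs_acc - cf_acc

    # Commonsense Collapse Rate (assumed): among counterfactual errors,
    # the fraction where the model output the commonsense alternative.
    errors = [(p, prior) for p, gt, prior in cf_results if p != gt]
    ccr = sum(p == prior for p, prior in errors) / len(errors) if errors else 0.0

    # Relative Prior Dependency (assumed: CFAD normalized by CS-Acc).
    rpd = cfad / cs_acc if cs_acc else 0.0

    return {"CF-Acc": cf_acc, "CS-Acc": cs_acc,
            "CFAD": cfad, "CCR": ccr, "RPD": rpd}
```

For example, a model that answers half the anomalous items correctly but always falls back to the prior when it errs would score CF-Acc = 0.5 and CCR = 1.0 under these assumed definitions.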
Problem

Research questions and friction points this paper is trying to address.

commonsense-driven hallucination
visual fidelity
vision-language models
visual evidence--commonsense conflict
hallucination benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

commonsense-driven hallucination
visual fidelity
vision-language models
CDH-Bench
visual evidence--commonsense conflict
Kesheng Chen
Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Institute of Cyberspace Security, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Yamin Hu
Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Institute of Cyberspace Security, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Qi Zhou
Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Institute of Cyberspace Security, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Zhenqian Zhu
Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Institute of Cyberspace Security, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Wenjian Luo
Professor, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen
AI and Security, Intelligent Security, Secure Intelligence, Privacy Computation, Immune Computation