Causal-HalBench: Uncovering LVLMs Object Hallucinations Through Causal Intervention

📅 2025-11-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large Vision-Language Models (LVLMs) suffer from pervasive object hallucination, primarily caused by spurious correlations between co-occurring objects in training data. Method: This work applies causal inference to the problem, formalizing hallucination mechanisms via a structural causal model. The authors propose a systematic counterfactual sample generation method and introduce Causal-HalBench, a causal benchmark enabling quantitative evaluation of spurious-correlation effects. The evaluation framework integrates structural causal modeling, counterfactual reasoning, text-to-image generation, and proprietary LVLMs. Contribution/Results: Extensive experiments across state-of-the-art LVLMs demonstrate that all evaluated models exhibit significant susceptibility to spurious correlations. Crucially, causal interventions grounded in counterfactual analysis prove effective and broadly applicable for hallucination detection and mitigation, establishing a principled foundation for robust LVLM development.

📝 Abstract
Large Vision-Language Models (LVLMs) often suffer from object hallucination, making erroneous judgments about the presence of objects in images. We propose that this primarily stems from spurious correlations arising when models strongly associate highly co-occurring objects during training, leading to hallucinated objects influenced by visual context. Current benchmarks mainly focus on hallucination detection but lack a formal characterization and quantitative evaluation of spurious correlations in LVLMs. To address this, we introduce causal analysis into the object recognition scenario of LVLMs, establishing a Structural Causal Model (SCM). Utilizing the language of causality, we formally define spurious correlations arising from co-occurrence bias. To quantify the influence induced by these spurious correlations, we develop Causal-HalBench, a benchmark specifically constructed with counterfactual samples and integrated with comprehensive causal metrics designed to assess model robustness against spurious correlations. Concurrently, we propose an extensible pipeline for the construction of these counterfactual samples, leveraging the capabilities of proprietary LVLMs and Text-to-Image (T2I) models for their generation. Our evaluations of mainstream LVLMs using Causal-HalBench demonstrate that these models exhibit susceptibility to spurious correlations, albeit to varying extents.
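
To make the counterfactual evaluation concrete, here is a minimal sketch of the core measurement the abstract describes: compare a model's object-presence answers on factual images (context plus target object) against counterfactual images where the target object has been removed while the co-occurring context remains. The `AskLVLM` interface, the `spurious_effect` function, and the exact metric definition are illustrative assumptions, not the benchmark's actual specification.

```python
from typing import Callable, Sequence

# Hypothetical interface: ask(image_path, object_name) -> True if the
# LVLM answers that the object is present in the image. The benchmark's
# real query format and causal metrics may differ from this sketch.
AskLVLM = Callable[[str, str], bool]

def spurious_effect(ask: AskLVLM, pairs: Sequence[tuple[str, str, str]]) -> float:
    """Estimate susceptibility to spurious co-occurrence context.

    Each triple is (factual_image, counterfactual_image, target_object).
    The factual image contains both the co-occurring context and the
    target object; the counterfactual image keeps the context but has
    the target object removed (the intervention). A model driven by
    co-occurrence bias still reports the object after removal.
    """
    hallucinated = 0
    for factual, counterfactual, obj in pairs:
        # Count a hallucination when the model reports the object in both
        # images, even though it is absent from the counterfactual one.
        if ask(factual, obj) and ask(counterfactual, obj):
            hallucinated += 1
    return hallucinated / max(len(pairs), 1)
```

A robust model would drop its positive answer on the counterfactual image, driving this rate toward zero; a model relying on co-occurrence cues keeps hallucinating the removed object.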
Problem

Research questions and friction points this paper is trying to address.

Quantifying the spurious correlations that cause object hallucinations in LVLMs
Developing causal metrics to assess model robustness against co-occurrence bias
Creating a counterfactual benchmark to evaluate object recognition reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing causal analysis via a Structural Causal Model (SCM)
Developing Causal-HalBench, a benchmark built from counterfactual samples
Creating an extensible counterfactual-sample pipeline using proprietary LVLMs and T2I models (see the sketch below)
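
The pipeline in the last bullet can be sketched as a generate-and-verify loop: a captioning LVLM proposes a scene that keeps the co-occurring context object but explicitly excludes the target object, a T2I model renders it, and a detector confirms the intervention succeeded. All interfaces (`Captioner`, `T2IModel`, `Detector`), the retry logic, and the `CounterfactualSample` structure below are hypothetical placeholders, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Placeholder interfaces standing in for the proprietary LVLM and T2I
# models the paper uses; all names and signatures are assumptions.
Captioner = Callable[[str, str], str]    # (context_obj, excluded_obj) -> scene caption
T2IModel = Callable[[str], bytes]        # caption -> generated image (e.g., PNG bytes)
Detector = Callable[[bytes, str], bool]  # (image, object) -> object present?

@dataclass
class CounterfactualSample:
    caption: str
    image: bytes
    target_object: str   # the highly co-occurring object that must NOT appear
    context_object: str  # the context object that remains in the scene

def build_counterfactual(
    context_object: str,
    target_object: str,
    caption_with_lvlm: Captioner,
    generate_image: T2IModel,
    object_present: Detector,
    max_tries: int = 3,
) -> Optional[CounterfactualSample]:
    """Generate a counterfactual image: the usual co-occurrence context
    is present, but the strongly associated target object is absent."""
    for _ in range(max_tries):
        # 1. Ask an LVLM for a scene description that contains the
        #    context object but explicitly excludes the target object.
        caption = caption_with_lvlm(context_object, target_object)
        # 2. Synthesize the described scene with a text-to-image model.
        image = generate_image(caption)
        # 3. Accept the sample only if the context object really appears
        #    and the target object really does not.
        if object_present(image, context_object) and not object_present(image, target_object):
            return CounterfactualSample(caption, image, target_object, context_object)
    return None
```

The verification step matters because T2I models can reintroduce strongly co-occurring objects on their own; rejecting failed generations keeps the intervention clean.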