HulluEdit: Single-Pass Evidence-Consistent Subspace Editing for Mitigating Hallucinations in Large Vision-Language Models

📅 2026-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the critical issue of object hallucination in large vision-language models, which severely undermines their reliability, and proposes a novel intervention framework that operates in a single forward pass without requiring a reference model. By leveraging orthogonal subspace editing, the method decomposes hidden states into three orthogonal components—visual evidence, conflicting priors, and residual uncertainty—and selectively suppresses hallucination-inducing patterns. The approach provides a mathematical guarantee that modifications to the prior subspace do not interfere with visual evidence, thereby enabling efficient and evidence-consistent hallucination mitigation. Experimental results demonstrate state-of-the-art performance on the POPE and CHAIR benchmarks while preserving general capabilities on MME, significantly outperforming baselines such as contrastive decoding and static subspace editing.

📝 Abstract
Object hallucination in Large Vision-Language Models (LVLMs) significantly hinders their reliable deployment. Existing methods struggle to balance efficiency and accuracy: they often require expensive reference models and multiple forward passes, or apply static edits that risk suppressing genuine visual evidence. To address this, we introduce HulluEdit, a single-pass, reference-free intervention framework. Our core innovation is orthogonal subspace editing: we decompose the model's hidden states into orthogonal subspaces (visual evidence, conflicting priors, and residual uncertainty), enabling selective suppression of hallucinatory patterns without interfering with visual grounding. This approach mathematically guarantees that edits applied to the prior subspace leave the visual component entirely unaffected. Extensive experiments show that HulluEdit achieves state-of-the-art hallucination reduction on benchmarks including POPE and CHAIR across diverse architectures, while preserving general capabilities on MME and maintaining efficient inference. Our method consistently outperforms contrastive decoding and static subspace editing baselines, offering a new pathway toward more trustworthy LVLMs.
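The orthogonality guarantee described in the abstract can be illustrated with a minimal NumPy sketch. Note the subspace bases below are synthetic stand-ins built by QR factorization, not the paper's learned decomposition; `edit_hidden_state` and `alpha` are hypothetical names for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hidden dimension (illustrative)

# Build orthonormal bases for two mutually orthogonal subspaces by
# QR-factorizing a random matrix and splitting its columns:
# V spans a "visual evidence" subspace, P a "conflicting prior" subspace.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
V, P = Q[:, :6], Q[:, 6:10]

def edit_hidden_state(h, P, alpha=1.0):
    """Suppress the component of h lying in the prior subspace span(P)."""
    return h - alpha * P @ (P.T @ h)

h = rng.standard_normal(d)        # a hidden state to edit
h_edited = edit_hidden_state(h, P)

# Because span(V) is orthogonal to span(P), the visual component is
# provably untouched by the edit...
assert np.allclose(V.T @ h, V.T @ h_edited)
# ...while the prior component is fully removed (alpha = 1):
assert np.allclose(P.T @ h_edited, 0.0)
```

The assertions make the paper's claimed guarantee concrete: subtracting the projection onto the prior subspace cannot change any component that lies in an orthogonal subspace.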
Problem

Research questions and friction points this paper is trying to address.

object hallucination
Large Vision-Language Models
visual grounding
hallucination mitigation
model reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

orthogonal subspace editing
hallucination mitigation
single-pass intervention
visual grounding preservation
reference-free framework