TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention

📅 2025-03-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the pervasive object hallucination (OH) problem in large vision-language models (LVLMs). The authors find that token-level hidden states in LVLMs are high-specificity per-token indicators of hallucination behaviors, and that different LVLMs encode hallucination patterns in common latent subspaces, suggesting "generic truthful directions" shared across models. Leveraging these insights, they propose TruthPrInt (Truthful-Guided Pre-Intervention), which first learns the truthful direction of LVLM decoding and then applies truthful-guided inference-time intervention in latent space *prior to decoding* to detect and steer away from hallucination tendencies. A companion component, ComnHallu, constructs and aligns hallucination latent subspaces to improve both cross-LVLM and cross-data detection transferability. Crucially, the method requires no fine-tuning or additional annotations. Evaluated across multiple LVLM architectures and both in-domain and out-of-domain OH benchmarks, TruthPrInt consistently surpasses state-of-the-art methods, improving generation fidelity and generalization robustness.

📝 Abstract
Object Hallucination (OH) has been acknowledged as one of the major trustworthy challenges in Large Vision-Language Models (LVLMs). Recent advancements in Large Language Models (LLMs) indicate that internal states, such as hidden states, encode the "overall truthfulness" of generated responses. However, it remains under-explored how internal states in LVLMs function and whether they could serve as "per-token" hallucination indicators, which is essential for mitigating OH. In this paper, we first conduct an in-depth exploration of LVLM internal states in relation to OH issues and discover that (1) LVLM internal states are high-specificity per-token indicators of hallucination behaviors. Moreover, (2) different LVLMs encode universal patterns of hallucinations in common latent subspaces, indicating that there exist "generic truthful directions" shared by various LVLMs. Based on these discoveries, we propose Truthful-Guided Pre-Intervention (TruthPrInt) that first learns the truthful direction of LVLM decoding and then applies truthful-guided inference-time intervention during LVLM decoding. We further propose ComnHallu to enhance both cross-LVLM and cross-data hallucination detection transferability by constructing and aligning hallucination latent subspaces. We evaluate TruthPrInt in extensive experimental settings, including in-domain and out-of-domain scenarios, over popular LVLMs and OH benchmarks. Experimental results indicate that TruthPrInt significantly outperforms state-of-the-art methods. Codes will be available at https://github.com/jinhaoduan/TruthPrInt.
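The abstract's core recipe (learn a truthful direction from per-token hidden states, then intervene at inference time before decoding) can be sketched in a heavily simplified form. Everything below is illustrative: the mean-difference probe, the threshold test, and the function names are assumptions for exposition, not the paper's actual algorithm.

```python
import numpy as np

def truthful_direction(truthful_h, hallucinated_h):
    """Estimate a 'truthful direction' as the normalized difference of class
    means over per-token hidden states from a small labeled calibration set.

    truthful_h, hallucinated_h: (n, d) arrays of hidden states whose tokens
    were labeled truthful / hallucinated respectively.
    """
    d = truthful_h.mean(axis=0) - hallucinated_h.mean(axis=0)
    return d / np.linalg.norm(d)

def pre_intervene(hidden, direction, threshold=0.0, alpha=1.0):
    """Before decoding the next token, score its hidden state along the
    truthful direction; if the score suggests a hallucination tendency,
    shift the state toward truthfulness before it reaches the LM head."""
    score = float(hidden @ direction)
    if score < threshold:
        hidden = hidden + alpha * direction  # steer in latent space
    return hidden, score
```

In a real LVLM this would run inside a forward hook on a chosen transformer layer at every decoding step, with `threshold` and `alpha` tuned on held-out data.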
Problem

Research questions and friction points this paper is trying to address.

Mitigating object hallucination in Large Vision-Language Models (LVLMs).
Exploring internal states as per-token hallucination indicators in LVLMs.
Proposing TruthPrInt for truthful-guided inference-time intervention in LVLMs.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses latent truthful-guided pre-intervention for LVLM decoding
Identifies universal hallucination patterns in common latent subspaces
Enhances hallucination detection with ComnHallu cross-LVLM alignment
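The last point, aligning hallucination latent subspaces so a detector transfers across LVLMs, could look roughly like the sketch below: extract a low-dimensional subspace per model, then solve an orthogonal Procrustes problem to rotate one model's subspace coordinates into the other's. The PCA basis and Procrustes alignment are generic stand-ins chosen for illustration; the paper's actual ComnHallu construction may differ.

```python
import numpy as np

def subspace_basis(hidden_states, k):
    """Top-k principal directions of centered hidden states: a crude
    stand-in for one model's 'hallucination subspace' basis, shape (d, k)."""
    X = hidden_states - hidden_states.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k].T

def align_subspaces(basis_src, basis_tgt):
    """Orthogonal Procrustes: the rotation R (k, k) that best maps
    source-subspace coordinates onto the target's, so a hallucination
    detector trained in the target model's coordinates can score the
    source model's projected states."""
    u, _, vt = np.linalg.svd(basis_src.T @ basis_tgt)
    return u @ vt
```

A detector trained on `states_tgt @ basis_tgt` could then be applied to another model via `states_src @ basis_src @ R`.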
🔎 Similar Papers
2024-10-06 · Conference on Empirical Methods in Natural Language Processing · Citations: 33