Logical Consistency as a Bridge: Improving LLM Hallucination Detection via Label Constraint Modeling between Responses and Self-Judgments

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses factual hallucinations in large language models (LLMs), noting that existing detectors rely on either neural uncertainty or symbolic self-judgment alone and neglect the interplay between the two. To close this gap, the authors propose LaaB (Logical Consistency-as-a-Bridge), a framework that links model responses and self-judgments through logical consistency by mapping symbolic labels back into the feature space, unifying the neural and symbolic views. The core idea is a “meta-judgment” step: depending on the semantics of the self-judgment, the response label and the meta-judgment label must be either the same or opposite, and this constraint is used to align the two views so that they reinforce each other through mutual learning. Combining uncertainty quantification, self-judgment prompting, and multi-view mutual learning, LaaB is reported to outperform eight baselines across four datasets and four LLMs.

📝 Abstract

Large Language Models (LLMs) are prone to factual hallucinations, which undermines their reliability in real-world applications. Existing hallucination detectors mainly extract micro-level intrinsic patterns for uncertainty quantification or elicit macro-level self-judgments through verbalized prompts. However, these methods address only a single facet of hallucination, focusing either on implicit neural uncertainty or on explicit symbolic reasoning; they treat these inherently coupled behaviors in isolation and fail to exploit their interdependence for a holistic view. In this paper, we propose LaaB (Logical Consistency-as-a-Bridge), a framework that bridges neural features and symbolic judgments for hallucination detection. LaaB introduces a "meta-judgment" process to map symbolic labels back into the feature space. By leveraging the inherent logical bridge, in which response and meta-judgment labels are either the same or opposite depending on the self-judgment's semantics, LaaB aligns and integrates dual-view signals via mutual learning and enhances hallucination detection. Extensive experiments on 4 public datasets, across 4 LLMs, and against 8 baselines demonstrate the superiority of LaaB.
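
The "same or opposite" label constraint described in the abstract can be illustrated with a minimal Python sketch. This is one plausible reading and not the authors' implementation: the label encoding (a meta-judgment is positive when the self-judgment is wrong), the function names, and the squared-difference consistency measure are all assumptions made for illustration. The abstract does not specify the actual loss or the mutual-learning procedure.

```python
def expected_meta_label(response_is_hallucinated: bool,
                        judgment_says_correct: bool) -> bool:
    """Return True when the self-judgment is wrong (assumed meta-judgment label).

    Assumed encoding, not taken from the paper:
    - A self-judgment of "correct" is wrong exactly when the response is
      hallucinated, so the two labels are the SAME.
    - A self-judgment of "incorrect" is wrong exactly when the response is
      factual, so the two labels are OPPOSITE.
    """
    if judgment_says_correct:
        return response_is_hallucinated
    return not response_is_hallucinated


def bridge_to_response_view(p_judgment_wrong: float,
                            judgment_says_correct: bool) -> float:
    """Convert P(self-judgment is wrong) into P(response is hallucinated)."""
    if judgment_says_correct:
        return p_judgment_wrong
    return 1.0 - p_judgment_wrong


def consistency_gap(p_response_halluc: float,
                    p_judgment_wrong: float,
                    judgment_says_correct: bool) -> float:
    """Squared disagreement between the two views after bridging.

    A value of 0 means the response-view and meta-judgment-view
    probabilities are logically consistent. This is a hypothetical
    stand-in for whatever alignment objective the paper uses.
    """
    bridged = bridge_to_response_view(p_judgment_wrong, judgment_says_correct)
    return (p_response_halluc - bridged) ** 2
```

For instance, if the model judged its own response "incorrect" and a meta-judgment detector assigns probability 0.2 to that judgment being wrong, the bridged estimate that the response is hallucinated is 0.8. A response-view detector that also outputs 0.8 would then have a consistency gap of approximately zero, while one that outputs 0.2 would be flagged as disagreeing with the symbolic view.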

Problem

Research questions and friction points this paper is trying to address.

hallucination detection

logical consistency

large language models

self-judgment

neural-symbolic integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

logical consistency

hallucination detection

self-judgment