AI Summary
This work addresses factual hallucinations in large language models (LLMs), noting that existing detection approaches rely on either neural uncertainty or symbolic self-judgment alone and neglect their interplay. To bridge this gap, the authors propose the Logical Consistency-as-a-Bridge (LaaB) framework, which establishes a logical consistency link between model responses and self-judgments by mapping symbolic labels back into the feature space, thereby unifying the neural and symbolic views. The core innovation is a "meta-judgment" mechanism: depending on the semantics of the self-judgment, the response label and the meta-judgment label must be either identical or opposite, and this constraint is used to align the two views so that they reinforce each other. Combining uncertainty quantification, self-judgment prompting, and multi-view mutual learning, LaaB outperforms eight baselines in experiments on four public datasets and four LLMs.
Abstract
Large Language Models (LLMs) are prone to factual hallucinations, which undermine their reliability in real-world applications. Existing hallucination detectors mainly extract micro-level intrinsic patterns for uncertainty quantification or elicit macro-level self-judgments through verbalized prompts. However, these methods address only a single facet of hallucination, focusing on either implicit neural uncertainty or explicit symbolic reasoning. They therefore treat two inherently coupled behaviors in isolation and fail to exploit their interdependence for a holistic view. In this paper, we propose LaaB (Logical Consistency-as-a-Bridge), a framework that bridges neural features and symbolic judgments for hallucination detection. LaaB introduces a "meta-judgment" process to map symbolic labels back into the feature space. By leveraging the inherent logical bridge, in which the response and meta-judgment labels are either the same or opposite depending on the self-judgment's semantics, LaaB aligns and integrates the dual-view signals via mutual learning and improves hallucination detection. Extensive experiments on 4 public datasets, across 4 LLMs, and against 8 baselines demonstrate the superiority of LaaB.
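The "same or opposite" relation between response and meta-judgment labels can be made concrete with a small sketch. This is only an illustration of the logical bridge as the abstract describes it, not the authors' implementation: the function names (`bridge_label`, `bridge_probability`, `consistency_gap`) are hypothetical, the self-judgment is assumed to be a binary verdict ("the response is correct" or "the response is incorrect"), and the absolute-difference gap is an assumed stand-in for whatever alignment objective the paper actually uses.

```python
def bridge_label(judgment_says_correct: bool, meta_judgment_is_right: bool) -> bool:
    """Infer whether the response is factual from the meta-judgment label.

    If the self-judgment asserts "the response is correct", the response and
    meta-judgment labels coincide; if it asserts "incorrect", they are opposite.
    """
    if judgment_says_correct:
        return meta_judgment_is_right
    return not meta_judgment_is_right


def bridge_probability(p_meta_right: float, judgment_says_correct: bool) -> float:
    """Soft version: map P(self-judgment is right) into P(response is factual)."""
    return p_meta_right if judgment_says_correct else 1.0 - p_meta_right


def consistency_gap(p_response_factual: float,
                    p_meta_right: float,
                    judgment_says_correct: bool) -> float:
    """Assumed disagreement measure between the two views (0 means consistent)."""
    bridged = bridge_probability(p_meta_right, judgment_says_correct)
    return abs(p_response_factual - bridged)


if __name__ == "__main__":
    # The model judged its own answer "incorrect", and that judgment is
    # probably wrong (p = 0.1), so the bridge implies the answer is
    # probably factual.
    print(bridge_probability(0.1, judgment_says_correct=False))
    print(consistency_gap(0.9, 0.1, judgment_says_correct=False))
```

Under these assumptions, a detector scoring the response and a detector scoring the self-judgment can be compared in the same label space, which is what allows the two views to supervise each other during mutual learning.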