🤖 AI Summary
To address the high computational overhead and poor generalizability of verification methods that depend on external resources for checking large language model (LLM) reasoning paths, this paper proposes a lightweight, plug-and-play self-instructed verification approach. The core insight, identified here for the first time, is that the rank of the correlation matrix between the input question and the LLM’s internal reasoning-path activation representations serves as a reliable intrinsic indicator of reasoning correctness. Unlike prior methods, this approach requires no external verifiers, additional parameters, or complex prompting; instead, it computes the correlation matrix from forward-pass activations and analyzes its rank to perform unsupervised consistency discrimination, enabling reweighting and ranking of candidate reasoning paths. Evaluated across multiple models (Llama, Qwen, Phi-3) and diverse reasoning tasks (mathematical, commonsense, symbolic), the method achieves >75% path discrimination accuracy and an average performance gain of 8.2%, with negligible computational overhead.
📝 Abstract
Despite the strong reasoning ability of large language models (LLMs), they are prone to errors and hallucinations. As a result, how to check their outputs effectively and efficiently has become a critical problem in their applications. Existing checking methods rely heavily on external resources, such as trained verifiers (e.g., process/outcome reward models) or elaborate prompts, which incur high computational overhead and are only applicable to specific domains. In this paper, we investigate whether the internal behaviors of LLMs already reveal the credibility of their reasoning paths. Specifically, we find that the rank of the correlation matrix between the input problem and the output reasoning path is a robust indicator of reasoning correctness. Different from other correctness indicators for LLMs, the calculation of the correlation matrix relies only on the LLM itself, which avoids the hassle of training a separate model or designing complicated prompts. Based on this indicator, we design a simple, plug-and-play Self-Indicator method to reweight candidate reasoning paths, which achieves significant performance improvements over other voting and verification methods with very little computational overhead. Our experiments across multiple LLMs of varying scales and model families further demonstrate the effectiveness of Self-Indicator. It achieves over 75% accuracy in distinguishing correct reasoning paths from incorrect ones and, in turn, improves accuracy on three reasoning benchmarks by more than 8%.
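To make the idea concrete, the indicator described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the choice of layer, normalization, and the direction of the rank/correctness relationship are assumptions, and the `temperature`-based softmax reweighting is one plausible way to turn rank scores into path weights.

```python
import numpy as np

def correlation_rank(question_acts, path_acts, tol=1e-6):
    """Numerical rank of the question/path correlation matrix.

    question_acts: (n_q, d) activation vectors for the input question tokens
    path_acts:     (n_p, d) activation vectors for one candidate reasoning path
    Both are hypothetical inputs taken from a forward pass; the paper's exact
    layer and normalization choices are not specified here.
    """
    # Cross-correlation between question and path token representations.
    corr = question_acts @ path_acts.T          # shape (n_q, n_p)
    s = np.linalg.svd(corr, compute_uv=False)
    # Numerical rank: singular values above a tolerance relative to the largest.
    return int(np.sum(s > tol * s.max()))

def reweight_paths(question_acts, candidate_acts, temperature=1.0):
    """Softmax reweighting of candidate paths by their correlation-matrix rank
    (an assumed scoring rule for illustration)."""
    ranks = np.array(
        [correlation_rank(question_acts, p) for p in candidate_acts],
        dtype=float,
    )
    w = np.exp(ranks / temperature)
    return w / w.sum()
```

A usage sketch: collect hidden states for the question and for each sampled reasoning path from a single forward pass per path, then feed them to `reweight_paths` to rank or reweight candidates before a final vote. No extra model or training is involved, which is the property the abstract emphasizes.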