AI Summary
This work addresses the susceptibility of large language models to logical hallucinations and entropy drift during long-chain reasoning, where static Classifier-Free Guidance often leads to semantic dilution. The authors propose a lightweight, inference-time intervention framework that uses an adaptive dual-threshold mechanism to detect abrupt entropy surges in real time and treats them as indicators of logical errors. When a surge is detected, the method replaces the uninformative null prior with a reference distribution fused from historical high-confidence states, and it modulates guidance intensity according to the current structured reasoning stage. Combining real-time entropy monitoring, dynamic null-prior replacement, and phase-aware guidance intensity modulation, the approach achieves an absolute accuracy improvement of 20.0% on the AIME25 benchmark and mitigates uncontrolled entropy drift in complex reasoning tasks.
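As a rough illustration of the entropy-monitoring step, the sketch below computes the Shannon entropy of each next-token distribution and flags a spike when two conditions hold together. The summary does not spell out the two thresholds, so the pairing used here (a level threshold adapted from a rolling window, plus a minimum jump over the previous step) is an assumption, as are the class name `EntropySpikeDetector` and all default parameter values.

```python
import math
from collections import deque


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class EntropySpikeDetector:
    """Hypothetical dual-threshold detector; not the authors' implementation.

    Threshold 1 (adaptive level): entropy must exceed the rolling mean
    plus k_sigma rolling standard deviations.
    Threshold 2 (jump): entropy must rise by more than min_jump nats
    relative to the previous non-spike step.
    """

    def __init__(self, window=16, k_sigma=2.0, min_jump=0.5, warmup=4):
        self.history = deque(maxlen=window)  # recent non-spike entropies
        self.k_sigma = k_sigma
        self.min_jump = min_jump
        self.warmup = warmup

    def update(self, probs):
        """Return (entropy, is_spike) for the current decoding step."""
        h = shannon_entropy(probs)
        spike = False
        if len(self.history) >= self.warmup:
            n = len(self.history)
            mean = sum(self.history) / n
            var = sum((x - mean) ** 2 for x in self.history) / n
            level_threshold = mean + self.k_sigma * math.sqrt(var)
            jump = h - self.history[-1]
            spike = h > level_threshold and jump > self.min_jump
        if not spike:
            # Keep spikes out of the baseline so they do not inflate it.
            self.history.append(h)
        return h, spike
```

In a decoding loop, `update` would be called once per generated token with the model's softmax output, and a `True` flag would trigger whatever repair step is in use.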
Abstract
Large Language Models (LLMs) are prone to logical hallucinations and stochastic drift during long-chain reasoning. While Classifier-Free Guidance (CFG) can improve instruction adherence, standard static implementations often cause semantic dilution and linguistic degradation. We propose SPREG (Structured Plan-guided Real-time Entropy Gating), a lightweight inference-time framework for surgical error rectification. SPREG employs an adaptive dual-threshold mechanism to monitor real-time entropy, identifying sudden "entropy spikes" as reliable indicators of logical failure. Upon detection, it triggers a dynamic repair by replacing uninformative null-priors with reference distributions synthesized from historical high-confidence states. By modulating guidance intensity according to structured reasoning stages (e.g., Action, Observation), SPREG steers the model back to a stable manifold without compromising fluency. Our experiments demonstrate significant gains, notably a 20.0% absolute accuracy improvement on AIME25, while effectively suppressing uncontrolled entropy drift in complex tasks.
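To make the repair step concrete, the following sketch applies the standard CFG combination in log space, with the slot normally occupied by the null prior filled by a reference distribution instead. The abstract does not specify how the reference is synthesized or how guidance strength varies by stage, so the simple averaging in `fuse_reference`, the `STAGE_SCALE` table, and its values are all hypothetical placeholders rather than the authors' method.

```python
import math

# Hypothetical per-stage guidance scales; the values are illustrative only.
STAGE_SCALE = {"Action": 1.5, "Observation": 1.0}


def fuse_reference(high_conf_dists):
    """Average stored high-confidence distributions into one reference.

    Plain averaging is an assumption; the fusion rule is not given above.
    """
    n = len(high_conf_dists)
    return [sum(column) / n for column in zip(*high_conf_dists)]


def guided_distribution(cond, reference, scale, eps=1e-12):
    """Standard CFG in log space with `reference` in the null-prior slot.

    score_i = log r_i + scale * (log c_i - log r_i), then softmax.
    With scale == 1 this returns the conditional distribution unchanged.
    """
    scores = [
        math.log(r + eps) + scale * (math.log(c + eps) - math.log(r + eps))
        for c, r in zip(cond, reference)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Example: repair one step during a (hypothetical) "Action" stage.
reference = fuse_reference([[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]])
current = [0.5, 0.3, 0.2]
repaired = guided_distribution(current, reference, STAGE_SCALE["Action"])
```

A scale above 1 amplifies the difference between the current distribution and the reference, exactly as it does between the conditional and unconditional branches in ordinary CFG. Whether SPREG uses this direction and form of combination is not stated in the abstract.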