AI Summary
This work proposes STARS, a novel framework that addresses the challenge of cognitive inertia in large reasoning models, which often leads to overthinking or rigid reasoning patterns that existing methods struggle to mitigate. STARS introduces an unsupervised approach to detect critical reasoning turning points by monitoring abrupt changes in the L2 distance of hidden states within the model's latent space. By integrating geometric trajectory analysis with state-aware linguistic prompts, the framework dynamically steers the model's reasoning path in real time. Notably, STARS operates without requiring additional training or fine-tuning, thereby overcoming the limitations of conventional methods that rely on surface-level textual heuristics. Experimental results demonstrate that STARS significantly reduces redundant reasoning loops and consistently improves reasoning accuracy across multiple benchmarks.
Abstract
While Large Reasoning Models (LRMs) have achieved remarkable performance by scaling test-time compute, they frequently suffer from Cognitive Inertia, a failure pattern manifesting as either overthinking (inertia of motion) or reasoning rigidity (inertia of direction). Existing detection methods, typically relying on superficial textual heuristics like self-correction tokens, often fail to capture the model's unvoiced internal conflicts. To address this, we propose STARS (Spike-Triggered Adaptive Reasoning Steering), a training-free framework designed to rectify cognitive inertia by monitoring latent dynamics. STARS identifies Cognitive Pivots (critical moments of reasoning transition) by detecting distinct L2 distance spikes in the hidden states. Upon detection, the framework employs geometric trajectory analysis to diagnose the structural nature of the transition and injects state-aware language cues to steer the model in real time. Our experiments across diverse benchmarks confirm that STARS efficiently curtails redundant loops while improving accuracy through the adaptive correction of erroneous trajectories. STARS offers a robust, unsupervised mechanism to optimize the reasoning process of LRMs without requiring additional fine-tuning.
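To make the spike-detection idea concrete, here is a minimal sketch of flagging abrupt jumps in the L2 distance between consecutive hidden states. It is not the authors' implementation. The abstract does not specify how a "spike" is defined, so the trailing-window z-score rule, the function names `l2_distances` and `detect_spikes`, and the parameters `window`, `z_threshold`, and `min_sigma` are all illustrative assumptions. The toy two-dimensional vectors stand in for real hidden states.

```python
import math
from statistics import mean, pstdev


def l2_distances(hidden_states):
    """L2 distance between each pair of consecutive hidden-state vectors."""
    return [
        math.dist(prev, curr)
        for prev, curr in zip(hidden_states, hidden_states[1:])
    ]


def detect_spikes(distances, window=4, z_threshold=3.0, min_sigma=1e-3):
    """Return indices i where distances[i] far exceeds its trailing window.

    Assumption: a spike is a z-score above z_threshold relative to the
    previous `window` distances. The source does not state its criterion.
    """
    spikes = []
    for i in range(window, len(distances)):
        history = distances[i - window:i]
        mu = mean(history)
        # Floor sigma so a nearly flat history does not blow up the z-score.
        sigma = max(pstdev(history), min_sigma)
        if (distances[i] - mu) / sigma > z_threshold:
            spikes.append(i)
    return spikes


# Toy trajectory: small steady steps, one abrupt jump, then small steps again.
states = [
    [0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [0.3, 0.0], [0.4, 0.0],
    [0.5, 0.0], [3.5, 4.0], [3.6, 4.0], [3.7, 4.0],
]
dists = l2_distances(states)
pivots = detect_spikes(dists)
print(pivots)
```

In this toy run the only flagged index is the step from the sixth to the seventh state, where the distance jumps from about 0.1 to 5.0. In a real setting the vectors would be per-token or per-step hidden states taken from the model, and a flagged index would be the point at which the trajectory analysis and cue injection described in the abstract take over.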