🤖 AI Summary
This work addresses the suboptimal performance of large language models on complex reasoning tasks, often attributed to unreliable internal activations. While existing approaches rely on costly post-training or extensive sampling, this paper introduces AdaRAS—a lightweight and efficient framework that identifies and leverages “reasoning-critical neurons” (RCNs), whose activations are highly correlated with reasoning correctness. AdaRAS adaptively modulates these neurons’ activations at test time using a polarity-aware mean-difference criterion, without requiring any additional training or sampling. The method demonstrates strong cross-model and cross-dataset transferability, achieving significant performance gains across ten mathematical and programming benchmarks. Notably, it improves scores on AIME-24 and AIME-25 by over 13%, outperforming conventional post-training strategies.
📝 Abstract
Despite the strong reasoning capabilities of recent large language models (LLMs), achieving reliable performance on challenging tasks often requires post-training or computationally expensive sampling strategies, limiting their practical efficiency. In this work, we first show that a small subset of neurons in LLMs exhibits strong predictive correlations with reasoning correctness. Based on this observation, we propose AdaRAS (Adaptive Reasoning Activation Steering), a lightweight test-time framework that improves reasoning reliability by selectively intervening on neuron activations. AdaRAS identifies Reasoning-Critical Neurons (RCNs) via a polarity-aware mean-difference criterion and adaptively steers their activations during inference, enhancing incorrect reasoning traces while avoiding degradation on already-correct cases. Experiments on 10 mathematics and coding benchmarks demonstrate consistent improvements, including over 13% gains on AIME-24 and AIME-25. Moreover, AdaRAS exhibits strong transferability across datasets and scalability to stronger models, outperforming post-training methods without additional training or sampling cost.
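The abstract's selection-and-steering idea can be sketched in a few lines: score each neuron by the gap between its mean activation on correct versus incorrect reasoning traces, keep the sign of that gap as the neuron's polarity, and nudge the top-scoring neurons along their helpful direction at inference. The sketch below is a hedged illustration on synthetic data; the variable names, the choice of `k`, and the additive steering rule are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_correct, n_incorrect, n_neurons = 64, 64, 512

# Simulated hidden activations collected from correct / incorrect traces.
acts_correct = rng.normal(0.0, 1.0, (n_correct, n_neurons))
acts_incorrect = rng.normal(0.0, 1.0, (n_incorrect, n_neurons))
acts_correct[:, :8] += 1.5  # plant a few neurons that fire higher when correct

# Polarity-aware mean-difference score: the sign records which direction
# of activation is associated with correct reasoning.
score = acts_correct.mean(axis=0) - acts_incorrect.mean(axis=0)

# Keep the k neurons with the largest |score| as "reasoning-critical neurons".
k = 8
rcn_idx = np.argsort(-np.abs(score))[:k]
polarity = np.sign(score[rcn_idx])

def steer(hidden, alpha=0.5):
    """Shift RCN activations along their helpful polarity at test time."""
    steered = hidden.copy()
    steered[rcn_idx] += alpha * polarity * np.abs(score[rcn_idx])
    return steered

h = rng.normal(0.0, 1.0, n_neurons)
h_steered = steer(h)
```

In a real model the same idea would typically be applied inside a forward hook on a chosen layer, with `alpha` tuned so that already-correct generations are not degraded.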