Reasoning Can Be Restored by Correcting a Few Decision Tokens

📅 2026-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the tendency of base large language models to fail on complex reasoning tasks because of erroneous early decisions at a few critical tokens, a failure mode that is hard to localize. By quantifying token-level distributional disagreement between a base model and a stronger reasoning model with likelihood-based divergences, the authors find that the salient disagreement is sparse: on Qwen3-0.6B, roughly 8% of generated tokens account for it, and these tokens fall early in the response and are enriched in planning-related decisions. They propose disagreement-guided token intervention, an inference-time scheme in which the reasoning model takes over for a single token at high-disagreement positions and generation then returns immediately to the base model. With a small intervention budget, this sparse delegation substantially recovers, and can even surpass, the performance of a same-size reasoning model.
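To make the idea of token-level disagreement concrete, the following is a minimal sketch of scoring each position by a divergence between the two models' next-token distributions and keeping the top fraction. It is not the authors' code: the choice of KL divergence, the function names `kl_divergence` and `select_high_disagreement`, the `budget` parameter, and the toy distributions are all assumptions made for illustration. The summary only states that likelihood-based divergences are used.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    Assumption: KL is one possible likelihood-based divergence; the source
    does not say which divergence the authors use.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def select_high_disagreement(reasoner_dists, base_dists, budget=0.08):
    """Return the positions whose divergence falls in the top `budget` fraction.

    `budget` is a hypothetical parameter; 0.08 mirrors the ~8% figure
    reported for Qwen3-0.6B.
    """
    scores = [kl_divergence(r, b) for r, b in zip(reasoner_dists, base_dists)]
    k = max(1, round(budget * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k]), scores

# Toy data: 25 positions over a 4-token vocabulary. The two models agree
# everywhere except positions 1 and 3, where the reasoner is far more confident.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.85, 0.05, 0.05, 0.05]
base_dists = [uniform] * 25
reasoner_dists = [peaked if i in (1, 3) else uniform for i in range(25)]

positions, scores = select_high_disagreement(reasoner_dists, base_dists)
print(positions)  # → [1, 3]
```

In this toy setup the two early positions where the models differ are exactly the ones selected, which mirrors the reported pattern of sparse, early disagreement.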

📝 Abstract

Large reasoning models (LRMs) substantially outperform their base LLM counterparts on challenging reasoning benchmarks, yet it remains poorly understood where base models go wrong during token-by-token generation and how to narrow this gap efficiently. We study the base-reasoning gap by quantifying token-level distributional disagreement between a base model and a stronger reasoning model using likelihood-based divergences. Across benchmarks, we find that the reasoning advantage is highly sparse and concentrates on a small set of early, planning-related decision tokens. For instance, on Qwen3-0.6B, only ~8% of generated tokens account for the salient disagreement, and these tokens concentrate early in the response, are strongly enriched in planning-related decisions (17x), and coincide with high base-model uncertainty, suggesting that base models fail mainly at early planning points that steer the subsequent reasoning trajectory. Building on these findings, we propose disagreement-guided token intervention, a simple inference-time delegation scheme that performs a one-token takeover by the reasoning model only at high-disagreement positions and immediately switches back to the base model. With a small intervention budget, this sparse delegation substantially recovers and can even surpass the performance of a same-size reasoning model on challenging reasoning tasks. Code is available at https://github.com/AlphaLab-USTC/RRTokenIntervention.
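The one-token takeover described in the abstract can be sketched as a decoding loop. This is an illustrative toy, not the released implementation: the models are stand-in Python functions that return a token-to-probability dict, and the KL-based trigger, the `threshold` value, greedy decoding, and all names (`generate_with_intervention`, `toy_base`, `toy_reasoner`) are assumptions. For simplicity the sketch queries both models at every step; how disagreement is detected cheaply at inference time is not specified in the abstract.

```python
import math

def kl(p, q, eps=1e-12):
    """KL(p || q) for dict-based distributions (assumed divergence choice)."""
    return sum(pv * math.log((pv + eps) / (q.get(tok, 0.0) + eps))
               for tok, pv in p.items() if pv > 0)

def generate_with_intervention(base, reasoner, prompt, max_new_tokens,
                               threshold, eos="<eos>"):
    """Greedy decoding where the reasoner picks a single token whenever the
    two models disagree strongly; control then returns to the base model."""
    tokens = list(prompt)
    takeovers = []
    for step in range(max_new_tokens):
        p_base = base(tokens)
        p_reason = reasoner(tokens)
        if kl(p_reason, p_base) > threshold:
            nxt = max(p_reason, key=p_reason.get)  # one-token takeover
            takeovers.append(step)
        else:
            nxt = max(p_base, key=p_base.get)      # base model continues
        tokens.append(nxt)
        if nxt == eos:
            break
    return tokens[len(prompt):], takeovers

def toy_base(tokens):
    """Hypothetical base model: prefers a poor first decision ("guess")."""
    table = {
        "Q": {"guess": 0.6, "plan": 0.4},
        "plan": {"solve": 0.9, "<eos>": 0.1},
    }
    return table.get(tokens[-1], {"<eos>": 1.0})

def toy_reasoner(tokens):
    """Hypothetical reasoner: differs from the base only at the first decision."""
    if tokens[-1] == "Q":
        return {"plan": 0.95, "guess": 0.05}
    return toy_base(tokens)

with_help, takeovers = generate_with_intervention(
    toy_base, toy_reasoner, ["Q"], max_new_tokens=5, threshold=0.5)
alone, _ = generate_with_intervention(
    toy_base, toy_reasoner, ["Q"], max_new_tokens=5, threshold=float("inf"))

print(with_help, takeovers)  # → ['plan', 'solve', '<eos>'] [0]
print(alone)                 # → ['guess', '<eos>']
```

In the toy run a single intervention at the first decision token changes the whole continuation, while every later token still comes from the base model. This is the behavior the abstract attributes to early planning points.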

Problem

Research questions and friction points this paper is trying to address.

reasoning gap

token-level disagreement

decision tokens

planning failure

base-reasoning gap

Innovation

Methods, ideas, or system contributions that make the work stand out.

token-level intervention

reasoning models

distributional disagreement