Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers

📅 2026-01-11

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

183K/year

🤖 AI Summary

Existing hybrid reasoning language models rely on high-level instructions to modulate reasoning behavior, yet their effectiveness is often dominated by specific tokens—such as “Okay”—hindering fine-grained control. This work reveals for the first time that reasoning behavior is driven by low-level, token-level triggers and identifies newline patterns as suppressors of reasoning. Building on this insight, the authors propose Mid-Think, a training-free prompt formatting strategy that enables flexible trade-offs between reasoning length and accuracy. Evaluated on Qwen3-8B, Mid-Think improves AIME accuracy from 69.8% to 72.4% and GPQA accuracy from 58.5% to 61.1%, while reducing reinforcement learning training time by approximately 15%.

Technology Category

Application Category

📝 Abstract

Hybrid reasoning language models are commonly controlled through high-level Think/No-think instructions to regulate reasoning behavior, yet we found that such mode switching is largely driven by a small set of trigger tokens rather than the instructions themselves. Through attention analysis and controlled prompting experiments, we show that a leading ``Okay''token induces reasoning behavior, while the newline pattern following ``''suppresses it. Based on this observation, we propose Mid-Think, a simple training-free prompting format that combines these triggers to achieve intermediate-budget reasoning, consistently outperforming fixed-token and prompt-based baselines in terms of the accuracy-length trade-off. Furthermore, applying Mid-Think to RL training after SFT reduces training time by approximately 15% while improving final performance of Qwen3-8B on AIME from 69.8% to 72.4% and on GPQA from 58.5% to 61.1%, demonstrating its effectiveness for both inference-time control and RL-based reasoning training.

Problem

Research questions and friction points this paper is trying to address.

intermediate-budget reasoning

token-level triggers

reasoning control

training-free prompting

accuracy-length trade-off

Innovation

Methods, ideas, or system contributions that make the work stand out.

token-level triggers

training-free reasoning

intermediate-budget reasoning