Draft-Thinking: Learning Efficient Reasoning in Long Chain-of-Thought LLMs

πŸ“… 2026-02-28
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work proposes Draft-Thinking, a novel approach to mitigating the computational inefficiency of long chain-of-thought (CoT) reasoning in large language models, which often leads to "overthinking" and excessive inference costs. Departing from prior methods that focus on post-hoc pruning or distillation, Draft-Thinking restructures the reasoning process itself by introducing a concise, draft-like inference format. The framework is trained via progressive curriculum learning to internalize only the essential reasoning steps, and it incorporates an adaptive prompting mechanism that allows the model to adjust its reasoning depth dynamically. Evaluated on benchmarks such as MATH500, the method achieves an 82.6% reduction in inference budget with only a 2.6% drop in performance, effectively decoupling reasoning efficiency from model accuracy.

πŸ“ Abstract
Long chain-of-thought (CoT) has become a dominant paradigm for enhancing the reasoning capability of large reasoning models (LRMs); however, the performance gains often come with a substantial increase in reasoning budget. Recent studies show that existing CoT paradigms tend to induce systematic overthinking, unnecessarily coupling reasoning capability with reasoning cost. Most prior approaches reduce token usage through post hoc techniques such as token compression, truncation, or length penalties, without explicitly addressing the core mechanisms of reasoning. We propose Draft-Thinking, which guides models to first learn a concise draft-style reasoning structure that retains only the critical reasoning steps. Through progressive curriculum learning, the model stably internalizes this efficient reasoning pattern as its capability scales. Moreover, Draft-Thinking introduces adaptive prompting, which elevates reasoning depth to a flexible, model-selectable behavior. Extensive experiments demonstrate that Draft-Thinking substantially reduces reasoning budget while largely preserving reasoning performance; for example, on MATH500, it achieves an 82.6% reduction in reasoning budget at the cost of only a 2.6% performance drop.
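The paper does not publish implementation details, but the two ideas in the abstract can be sketched. Below is a minimal, hypothetical Python illustration: a curriculum schedule that progressively shrinks the reasoning-token budget toward a draft-style target (the 17.4% of the full budget implied by the quoted 82.6% reduction), and an adaptive prompt builder that makes reasoning depth a selectable behavior. All names (`draft_budget_schedule`, `build_prompt`) and the linear schedule are assumptions for illustration, not the authors' method.

```python
def draft_budget_schedule(stage: int, full_budget: int = 4096,
                          target_ratio: float = 0.174,
                          num_stages: int = 4) -> int:
    """Hypothetical progressive curriculum: linearly interpolate the
    reasoning-token budget from the full CoT budget at stage 0 down to
    target_ratio * full_budget at the final stage (an 82.6% reduction,
    matching the MATH500 figure quoted in the abstract)."""
    frac = stage / max(num_stages - 1, 1)
    budget = full_budget * (1 - frac * (1 - target_ratio))
    return int(round(budget))


def build_prompt(question: str, depth: str = "draft") -> str:
    """Hypothetical adaptive prompting: depth is a model-selectable
    behavior rather than a fixed property of the prompt."""
    instructions = {
        "draft": "Think in a terse draft: keep only the critical steps.",
        "full": "Reason step by step in full detail.",
    }
    return f"{instructions[depth]}\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    for s in range(4):
        print(s, draft_budget_schedule(s))
    print(build_prompt("What is 17 * 24?", depth="draft"))
```

In this toy schedule, stage 0 trains at the full 4096-token budget and the final stage at roughly 713 tokens; the actual curriculum design in the paper may differ.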
Problem

Research questions and friction points this paper is trying to address.

Chain-of-Thought
reasoning efficiency
overthinking
reasoning cost
large reasoning models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Draft-Thinking
efficient reasoning
chain-of-thought
curriculum learning
adaptive prompting