Efficient Reasoning via Thought-Training and Thought-Free Inference

📅 2025-11-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) exhibit degraded reasoning quality under no-chain-of-thought (no-CoT) inference, where explicit step-by-step reasoning is omitted. Method: This paper proposes the 3TF framework, which first imparts implicit reasoning capability via structured thought-training and then enforces concise, step-free answers through output constraints at inference time. It introduces a Short-to-Long training paradigm that combines hybrid-mode training, fine-tuning on CoT-annotated data, and an inference-time non-reasoning-mode constraint. Contribution/Results: Experiments demonstrate significant performance gains across multiple mainstream reasoning benchmarks under no-CoT settings. This work provides the first systematic empirical validation of high-quality implicit reasoning, establishing its feasibility and its advantage over conventional approaches, and introduces a novel paradigm for efficient, lightweight LLM inference.

📝 Abstract
Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) prompting to improve reasoning accuracy. However, most existing methods primarily compress verbose reasoning outputs. These Long-to-Short transformations aim to improve efficiency, but still rely on explicit reasoning during inference. In this work, we introduce 3TF (Thought-Training and Thought-Free inference), a framework for efficient reasoning that takes a Short-to-Long perspective. We first train a hybrid model that can operate in both reasoning and non-reasoning modes, and then further train it on CoT-annotated data to internalize structured reasoning, while enforcing concise, thought-free outputs at inference time using the no-reasoning mode. Unlike compression-based approaches, 3TF improves the reasoning quality of non-reasoning outputs, enabling models to perform rich internal reasoning implicitly while keeping external outputs short. Empirically, 3TF-trained models obtain large improvements on reasoning benchmarks under thought-free inference, demonstrating that high-quality reasoning can be learned and executed implicitly without explicit step-by-step generation.
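The hybrid training setup described in the abstract can be sketched as a data-layout exercise: the same CoT-annotated example is rendered once with the full reasoning trace and once as a thought-free target, so the model learns the reasoning internally while also learning a concise output mode. This is a minimal illustration under assumed conventions; the tag names (`<q>`, `<think>`, `<a>`) and the `render_example` helper are hypothetical, not the paper's actual tokens.

```python
# Hypothetical sketch of a hybrid-mode training layout for 3TF.
# Tag names and mode strings are assumptions for illustration only.

def render_example(question: str, cot: str, answer: str, mode: str) -> str:
    """Render one CoT-annotated item in 'reasoning' or 'no_reasoning' mode."""
    if mode == "reasoning":
        # Full chain-of-thought between think tags, then the short answer.
        return f"<q>{question}</q><think>{cot}</think><a>{answer}</a>"
    # Thought-free target: the think block is empty; only the answer is kept,
    # so the model is trained to produce the same answer without visible steps.
    return f"<q>{question}</q><think></think><a>{answer}</a>"

# One annotated item rendered in both modes (the hybrid-training pair).
q, cot, a = "2+3*4?", "3*4=12; 2+12=14", "14"
pair = [render_example(q, cot, a, m) for m in ("reasoning", "no_reasoning")]
print(pair[1])  # thought-free target: answer only, no reasoning steps
```

The key design point this sketch captures is that both modes share the same answer supervision; only the visibility of the reasoning trace differs between them.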
Problem

Research questions and friction points this paper is trying to address.

Improving reasoning accuracy without explicit step-by-step generation
Enabling implicit rich reasoning while keeping outputs concise
Training models to internalize structured reasoning for efficient inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training hybrid models with dual reasoning modes
Internalizing structured reasoning via CoT-annotated data
Executing implicit reasoning through thought-free inference
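The inference-time side of the points above can be sketched as a decoding constraint: the prompt is prefilled with an empty think block so generation starts directly at the answer, with a defensive post-filter in case reasoning tokens still leak out. The tag scheme and helper names here are assumptions for illustration, not the paper's actual implementation.

```python
import re

# Minimal sketch of thought-free inference (tag names assumed):
# the no-reasoning mode is enforced by prefilling an empty <think></think>
# block, so any reasoning the model learned stays implicit in its activations.

def thought_free_prompt(question: str) -> str:
    """Prefill an empty think block so decoding begins at the answer span."""
    return f"<q>{question}</q><think></think><a>"

def strip_thoughts(generation: str) -> str:
    """Defensive post-filter: drop any think span the model still emits."""
    return re.sub(r"<think>.*?</think>", "", generation, flags=re.DOTALL)

print(thought_free_prompt("2+3*4?"))
print(strip_thoughts("<think>3*4=12</think>14"))
```

Prefilling the constraint into the prompt, rather than filtering afterward, is what keeps the output short at generation time; the post-filter is only a fallback.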
Authors
Canhui Wu
Xi'an Jiaotong University, JD Future Academy
Qiong Cao
JD Exploration Academy, JD.com
Chao Xue
Beihang University
Wei Xi
Xi'an Jiaotong University
Xiaodong He
JD Future Academy