🤖 AI Summary
Large language models (LLMs) exhibit degraded reasoning quality under no-chain-of-thought (no-CoT) inference, where explicit step-by-step reasoning is omitted. Method: This paper proposes the 3TF framework, which first imparts implicit reasoning capability via structured thought training and then enforces concise, step-free answers through inference-time output constraints. It introduces a Short-to-Long training paradigm that combines hybrid-mode training, fine-tuning on CoT-annotated data, and an inference-time non-reasoning-mode constraint. Contribution/Results: Experiments demonstrate significant performance gains across multiple mainstream reasoning benchmarks under no-CoT settings. This work provides the first systematic empirical validation of high-quality implicit reasoning, establishing its feasibility and its advantage over conventional approaches, and introduces a new paradigm for efficient, lightweight LLM inference.
📝 Abstract
Recent advances in large language models (LLMs) have leveraged explicit Chain-of-Thought (CoT) prompting to improve reasoning accuracy. However, most existing methods primarily compress verbose reasoning outputs. These Long-to-Short transformations aim to improve efficiency, but still rely on explicit reasoning during inference. In this work, we introduce **3TF** (**T**hought-**T**raining and **T**hought-**F**ree inference), a framework for efficient reasoning that takes a Short-to-Long perspective. We first train a hybrid model that can operate in both reasoning and non-reasoning modes, and then further train it on CoT-annotated data to internalize structured reasoning, while enforcing concise, thought-free outputs at inference time using the no-reasoning mode. Unlike compression-based approaches, 3TF improves the reasoning quality of non-reasoning outputs, enabling models to perform rich internal reasoning implicitly while keeping external outputs short. Empirically, 3TF-trained models obtain large improvements on reasoning benchmarks under thought-free inference, demonstrating that high quality reasoning can be learned and executed implicitly without explicit step-by-step generation.
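The abstract describes selecting a non-reasoning mode at inference time and keeping visible outputs thought-free. The paper does not specify its chat template or control tokens, so the sketch below is purely illustrative: the `<reasoning>`/`<no_reasoning>` mode tags and the `<think>…</think>` span format are hypothetical placeholders for whatever markers a hybrid model would actually use.

```python
import re

# Hypothetical marker for explicit reasoning spans; many hybrid models
# wrap chain-of-thought in tags like this, but 3TF's actual format is
# not given in the abstract.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def build_prompt(question: str, reasoning: bool) -> str:
    """Wrap a question in a hypothetical template whose mode tag
    selects reasoning vs. non-reasoning behavior at inference time."""
    mode = "<reasoning>" if reasoning else "<no_reasoning>"
    return f"{mode}\nUser: {question}\nAssistant:"

def enforce_thought_free(output: str) -> str:
    """Strip any explicit <think>...</think> spans so the visible answer
    stays concise even if the model emits reasoning tokens anyway."""
    return THINK_RE.sub("", output).strip()
```

Under this reading, "thought-free inference" is a decoding-side constraint: the prompt requests the no-reasoning mode, and any stray reasoning span is filtered, e.g. `enforce_thought_free("<think>2+2=4</think>The answer is 4.")` yields just `"The answer is 4."`.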