Rethinking Chain-of-Thought from the Perspective of Self-Training

📅 2024-12-14
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address uncertainty, over-reasoning, and iterative redundancy in chain-of-thought (CoT) reasoning for large language models, this paper proposes a novel CoT framework based on a self-training paradigm. The framework comprises a task-specific prompting module and an adaptive iterative reasoning module, unifying CoT reasoning and self-training within a single modeling framework for the first time. It incorporates dynamic path pruning, confidence-driven iteration termination, and model-generated self-feedback to effectively suppress high step-wise similarity and spurious reasoning steps. Evaluated across multiple reasoning benchmarks, the method improves accuracy while simultaneously reducing average reasoning steps and computational overhead—achieving joint optimization of performance and efficiency.
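The paper does not include code here, so as a rough illustration of how confidence-driven iteration termination and similarity-based pruning might interact in such a loop, here is a minimal sketch. The `generate_step` callback, the thresholds, and the `difflib` similarity measure are all assumptions for illustration, not details from the paper:

```python
from difflib import SequenceMatcher

def adaptive_cot(generate_step, max_iters=8, conf_threshold=0.9, sim_threshold=0.95):
    """Toy adaptive reasoning loop (illustrative only).

    generate_step(history) -> (step_text, confidence) stands in for one
    model call; the name, signature, and threshold values are assumptions.
    """
    history = []
    for _ in range(max_iters):
        step, conf = generate_step(history)
        # Pruning: stop if the new step is nearly identical to the previous
        # one (the "high similarity between consecutive iterations" case).
        if history and SequenceMatcher(None, history[-1], step).ratio() >= sim_threshold:
            break
        history.append(step)
        # Confidence-driven termination: stop once the model is sure enough.
        if conf >= conf_threshold:
            break
    return history

# Scripted stand-in for a model: the third step repeats the second.
def scripted(history):
    steps = [("step A", 0.5), ("step B", 0.7), ("step B", 0.99)]
    return steps[len(history)]

trace = adaptive_cot(scripted)
# → ['step A', 'step B']  (third step pruned as a near-duplicate)
```

The two stopping rules address different failure modes: the confidence check caps over-reasoning once the answer has stabilized, while the similarity check cuts off redundant iterations that merely restate the previous step.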

📝 Abstract
Chain-of-thought (CoT) reasoning has emerged as an effective approach for activating latent capabilities in LLMs. Interestingly, we observe that both CoT reasoning and self-training share the same core objective: iteratively leveraging model-generated information to progressively reduce prediction uncertainty. Building on this insight, we propose a novel CoT framework to improve reasoning performance. Our framework integrates two key components: (i) a task-specific prompt module that optimizes the initial reasoning process, and (ii) an adaptive reasoning iteration module that dynamically refines the reasoning process and addresses the limitations of previous CoT approaches, i.e., over-reasoning and high similarity between consecutive reasoning iterations. Extensive experiments demonstrate that the proposed method achieves significant advantages in both performance and computational efficiency.
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Chained Reasoning
Efficiency Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chained Reasoning Method
Task-Guided Self-Adjustment
Enhanced Efficiency and Quality
Zongqian Wu
University of Electronic Science and Technology of China, SUTD
Baoduo Xu
University of Electronic Science and Technology of China
Ruochen Cui
University of Electronic Science and Technology of China
Mengmeng Zhan
University of Electronic Science and Technology of China
Xiaofeng Zhu
University of Electronic Science and Technology of China
Lei Feng
SUTD