CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs

📅 2025-10-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high cost and low efficiency of in-context learning (ICL) and fine-tuning for multi-step reasoning tasks, this paper proposes CoT Vectors, a method that transfers and probes the internal reasoning mechanisms of large language models (LLMs) via compact vector representations. The authors first identify a systematic three-stage reasoning process common across LLMs, then design task-agnostic, learnable CoT vectors that are lightly trained within a teacher-student framework to provide stable reasoning guidance. With a minimal number of trainable parameters, far fewer than parameter-efficient fine-tuning requires, CoT Vectors significantly outperform standard ICL across multiple reasoning benchmarks and approach the performance of state-of-the-art fine-tuning methods. Crucially, CoT vectors also serve as structured probes that uncover latent organizational principles underlying LLM reasoning, establishing a new paradigm for efficient and interpretable reasoning enhancement.

📝 Abstract
Chain-of-Thought (CoT) prompting has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs). However, existing implementations, such as in-context learning and fine-tuning, remain costly and inefficient. To improve CoT reasoning at a lower cost, and inspired by the task vector paradigm, we introduce CoT Vectors, compact representations that encode task-general, multi-step reasoning knowledge. Through experiments with Extracted CoT Vectors, we observe pronounced layer-wise instability, manifesting as a U-shaped performance curve that reflects a systematic three-stage reasoning process in LLMs. To address this limitation, we propose Learnable CoT Vectors, optimized under a teacher-student framework to provide more stable and robust guidance. Extensive evaluations across diverse benchmarks and models demonstrate that CoT Vectors not only outperform existing baselines but also achieve performance comparable to parameter-efficient fine-tuning methods, while requiring fewer trainable parameters. Moreover, by treating CoT Vectors as a probe, we uncover how their effectiveness varies due to latent space structure, information density, acquisition mechanisms, and pre-training differences, offering new insights into the functional organization of multi-step reasoning in LLMs. The source code will be released.
Problem

Research questions and friction points this paper is trying to address.

Improving Chain-of-Thought reasoning efficiency in LLMs
Addressing layer-wise instability in reasoning mechanisms
Probing latent space structure of multi-step reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing compact CoT Vectors for reasoning knowledge
Optimizing Learnable CoT Vectors via teacher-student framework
Using CoT Vectors as probes to analyze reasoning mechanisms
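The two vector types named above can be sketched in a toy form. Everything here is an illustrative assumption rather than the paper's implementation: the one-layer tanh "model", the additive injection, and the alignment objective are stand-ins for the (unreleased) architecture and teacher-student loss.

```python
import numpy as np

# Toy sketch of Extracted vs. Learnable CoT Vectors.
# All shapes, the frozen "layer", and the training loop are
# illustrative assumptions, not the paper's actual method.

rng = np.random.default_rng(0)
d = 16                                    # toy hidden size
W = rng.normal(size=(d, d)) / np.sqrt(d)  # frozen layer weights

def layer(h):
    """One frozen transformer-like layer, reduced to tanh(h @ W)."""
    return np.tanh(h @ W)

h_student = rng.normal(size=d)                    # activations: question only
h_teacher = h_student + 0.5 * rng.normal(size=d)  # activations: question + CoT

# "Extracted" CoT vector: a simple activation difference (task-vector style).
v_extracted = h_teacher - h_student

# "Learnable" CoT vector: optimize v so that injecting it into the student's
# hidden state aligns the next-layer output with the teacher's (a stand-in
# for the teacher-student objective).
target = layer(h_teacher)
v = np.zeros(d)
lr = 0.1
loss_before = float(np.sum((layer(h_student + v) - target) ** 2))
for _ in range(1000):
    z = layer(h_student + v)
    err = z - target
    grad_v = (err * (1.0 - z ** 2)) @ W.T  # manual backprop through tanh and W
    v -= lr * grad_v
loss_after = float(np.sum((layer(h_student + v) - target) ** 2))
print(f"alignment loss: {loss_before:.3f} -> {loss_after:.3f}")
```

In this toy, `v_extracted` closes the teacher-student gap exactly because injection is purely additive; the trained `v` instead recovers it by gradient descent, mirroring why a learned vector can be more robust when the real injection point and objective are less forgiving.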
Li Li
School of Computer Science & Engineering, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
Ziyi Wang
School of Computer Science & Engineering, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
Yongliang Wu
Southeast University
Vision-Language Model
Jianfei Cai
Professor of Data Science & AI, Monash University
Visual computing, multimedia, computer vision, multimedia networking
Xu Yang
School of Computer Science & Engineering, Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China