🤖 AI Summary
Explicit chain-of-thought (CoT) reasoning suffers from verbosity and inefficiency, making it difficult to compress without sacrificing performance or interpretability.
Method: We propose the first self-supervised framework that distills natural-language CoT into a continuous latent space, enabling joint modeling of explicit and implicit CoT with latent-state alignment. Our approach employs a shared-weight architecture incorporating three key components: (i) a latent-space alignment loss, (ii) continuous representation learning, and (iii) a differentiable thought decoding mechanism.
Contribution/Results: On GSM8k, our implicit CoT achieves performance on par with explicit CoT—surpassing prior state-of-the-art by 28.2% in accuracy—while attaining a 3.1× inference compression ratio. Crucially, it supports continuous thought decoding, yielding strong interpretability, cross-dataset generalization, and robust transferability. This work establishes a novel paradigm for efficient and interpretable neural reasoning.
📝 Abstract
Chain-of-Thought (CoT) enhances Large Language Models (LLMs) by enabling step-by-step reasoning in natural language. However, the language space may be suboptimal for reasoning. While implicit CoT methods attempt to enable reasoning without explicit CoT tokens, they have consistently lagged behind explicit CoT methods in task performance. We propose CODI (Continuous Chain-of-Thought via Self-Distillation), a novel framework that distills CoT into a continuous space, where a shared model acts as both teacher and student, jointly learning explicit and implicit CoT while aligning their hidden activations on the token generating the final answer. CODI is the first implicit CoT method to match explicit CoT's performance on GSM8k while achieving 3.1x compression, surpassing the previous state-of-the-art by 28.2% in accuracy. Furthermore, CODI demonstrates scalability, robustness, and generalizability to more complex CoT datasets. Additionally, CODI retains interpretability by decoding its continuous thoughts, making its reasoning process transparent. Our findings establish implicit CoT as not only a more efficient but also a more powerful alternative to explicit CoT.
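The self-distillation objective sketched in the abstract — aligning the teacher's (explicit CoT) and student's (implicit CoT) hidden activations on the token that generates the final answer — can be illustrated with a minimal sketch. The function name, the per-token hidden-state lists, and the choice of a mean absolute (L1) distance are assumptions for illustration; the paper's actual distance measure and training setup may differ.

```python
def latent_alignment_loss(teacher_hidden, student_hidden, answer_pos):
    """Illustrative latent-space alignment loss (not the paper's exact loss).

    teacher_hidden / student_hidden: lists of per-token hidden-state vectors
        from the shared-weight model run in explicit- and implicit-CoT mode.
    answer_pos: index of the token that produces the final answer, where the
        two reasoning modes are aligned.
    Returns the mean absolute difference between the two hidden states.
    """
    t = teacher_hidden[answer_pos]
    s = student_hidden[answer_pos]
    return sum(abs(ti - si) for ti, si in zip(t, s)) / len(t)

# Toy usage: two-token sequences with 2-dimensional hidden states.
teacher = [[1.0, 2.0], [0.5, -0.5]]
student = [[1.0, 2.0], [0.0, 0.0]]
loss = latent_alignment_loss(teacher, student, answer_pos=1)
print(loss)  # 0.5
```

Because teacher and student share weights, gradients from this alignment term update a single model, which is what lets explicit and implicit CoT be learned jointly.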