SIM-CoT: Supervised Implicit Chain-of-Thought

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Implicit chain-of-thought (CoT) reasoning offers token efficiency but suffers from latent state degradation, representation homogenization, and training instability due to the absence of step-level supervision. To address this, we propose SIM-CoT—a novel plug-in framework introducing lightweight auxiliary decoders that align implicit reasoning tokens with explicit CoT steps, enabling fine-grained supervision in the latent space. During training, explicit CoT annotations enhance semantic diversity and optimization stability; during inference, the auxiliary modules are removed, preserving high token efficiency. SIM-CoT is the first method to identify and mitigate latent state degradation in implicit CoT, while also enabling interpretable, step-wise visualization of reasoning dynamics. Experiments show consistent gains: +8.2% accuracy over baselines on GPT-2 and +3.0% on LLaMA-3.1 8B. Moreover, SIM-CoT achieves 2.3× higher token efficiency than explicit CoT on GPT-2, substantially narrowing the performance gap between small and large language models.

📝 Abstract
Implicit Chain-of-Thought (CoT) methods present a promising, token-efficient alternative to explicit CoT reasoning in Large Language Models (LLMs), but a persistent performance gap has limited the application of implicit CoT. We identify a core latent instability issue by scaling the computational budget of implicit CoT approaches: as we increase the number of implicit reasoning tokens to enhance performance, the training process often becomes unstable and collapses. Our analysis reveals that this instability arises from the latent representations becoming homogeneous and losing their semantic diversity, a failure caused by insufficient step-level supervision in existing implicit CoT approaches. To address this issue, we propose SIM-CoT, a plug-and-play training module that introduces step-level supervision to stabilize and enrich the latent reasoning space. Specifically, SIM-CoT employs an auxiliary decoder during training to align each implicit token with its corresponding explicit reasoning step, ensuring that latent states capture distinct and meaningful information. The proposed auxiliary decoder is removed during inference, preserving the computational efficiency of implicit CoT methods with no added overhead. In addition, the auxiliary decoder affords interpretability of implicit reasoning by projecting each latent token onto an explicit reasoning vocabulary, enabling per-step visualization of semantic roles and diagnosis. SIM-CoT significantly enhances both the in-domain accuracy and out-of-domain stability of various implicit CoT methods, boosting baselines like Coconut by +8.2% on GPT-2 and CODI by +3.0% on LLaMA-3.1 8B. Demonstrating strong scalability, SIM-CoT also surpasses the explicit CoT baseline on GPT-2 by 2.1% with 2.3× greater token efficiency, while substantially closing the performance gap on larger models like LLaMA-3.1 8B.
Problem

Research questions and friction points this paper is trying to address.

Implicit CoT methods show a persistent performance gap relative to explicit CoT, driven by training instability
Training often collapses as the number of implicit reasoning tokens grows, because latent representations become homogeneous
Existing approaches lack step-level supervision, so latent states lose semantic diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces step-level supervision to stabilize and enrich the latent reasoning space
Uses a lightweight auxiliary decoder to align each implicit token with its corresponding explicit reasoning step
Removes the auxiliary decoder at inference, preserving implicit CoT's token efficiency with no added overhead
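The core training idea described above (decode each latent reasoning token through an auxiliary head and supervise it against the explicit CoT step) can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the paper's implementation: the auxiliary decoder is reduced to a single linear projection (`W_dec`), each explicit step is reduced to one target token, and all sizes and data are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, VOCAB, STEPS = 16, 32, 4  # toy sizes, not from the paper

# Latent reasoning tokens produced by the implicit-CoT model, one per step.
latents = rng.normal(size=(STEPS, HIDDEN))

# Hypothetical auxiliary decoder: a single linear projection onto the
# explicit reasoning vocabulary (the paper uses a lightweight decoder module).
W_dec = 0.1 * rng.normal(size=(HIDDEN, VOCAB))

# Gold explicit-CoT supervision: one target token per implicit step (toy labels).
targets = rng.integers(0, VOCAB, size=STEPS)

def step_supervision_loss(latents, W_dec, targets):
    """Mean cross-entropy between decoded latent tokens and explicit step tokens."""
    logits = latents @ W_dec                              # (STEPS, VOCAB)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

loss = step_supervision_loss(latents, W_dec, targets)
print(float(loss))

# At inference, W_dec is simply discarded: the model consumes only the
# latent tokens, so the step-level supervision adds no runtime overhead.
```

Because the loss is computed per step rather than only on the final answer, each latent token receives a distinct supervision signal, which is how the method counters representation homogenization.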