🤖 AI Summary
Large reasoning models (LRMs) achieve strong performance but suffer from low token efficiency due to verbose explicit chain-of-thought (CoT) generation. This paper introduces Chain of Unconscious Thought (CoUT), a paradigm that, for the first time, brings psychological Unconscious Thought Theory (UTT) into LLM reasoning, guiding models to internalize reasoning in their latent representations rather than generating explicit CoT tokens. The approach first prompts the model to reason internally in its hidden layers, then applies a set of token-efficient strategies that trim unnecessary output tokens while preserving accuracy. Evaluated across multiple benchmarks, CoUT matches CoT's accuracy while reducing average token consumption by 47.62%, substantially improving inference efficiency without sacrificing performance. The implementation is publicly available.
📝 Abstract
Large Reasoning Models (LRMs) achieve promising performance but compromise token efficiency due to verbose reasoning processes. Unconscious Thought Theory (UTT) posits that complex problems can be solved more efficiently through internalized cognitive processes. Inspired by UTT, we propose a new reasoning paradigm, termed Chain of Unconscious Thought (CoUT), to improve the token efficiency of LRMs by guiding them to mimic human unconscious thought and internalize their reasoning processes. Concretely, we first prompt the model to internalize its reasoning by thinking in the hidden layers. We then design a bag of token-efficient strategies that further help the model reduce unnecessary tokens while preserving performance. Our work reveals that models may possess beneficial unconscious thought, enabling improved efficiency without sacrificing performance. Extensive experiments demonstrate the effectiveness of CoUT: remarkably, it surpasses CoT by reducing token usage by 47.62% while maintaining comparable accuracy, as shown in Figure 1. The code of CoUT is available at https://github.com/Rohan-GRH/CoUT
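To make the core idea concrete, here is a minimal sketch of the kind of prompting contrast the abstract describes: an explicit-CoT instruction that asks the model to write out its reasoning versus a CoUT-style instruction that asks it to reason internally and emit only the final answer. The prompt wording, function names, and the crude whitespace token counter below are illustrative assumptions, not the paper's actual prompts or tokenizer.

```python
# Hypothetical illustration of explicit-CoT vs. CoUT-style prompting.
# The instruction texts are assumptions for demonstration, not the
# prompts used in the paper.

COT_INSTRUCTION = (
    "Solve the problem. Think step by step and write out your full "
    "reasoning before giving the final answer."
)

COUT_INSTRUCTION = (
    "Solve the problem. Reason internally without writing out any "
    "intermediate steps, then output only the final answer."
)


def build_prompt(question: str, style: str = "cout") -> str:
    """Assemble a prompt in either explicit-CoT or CoUT style."""
    instruction = COT_INSTRUCTION if style == "cot" else COUT_INSTRUCTION
    return f"{instruction}\n\nQuestion: {question}\nAnswer:"


def token_count(text: str) -> int:
    """Crude whitespace token count, standing in for a real tokenizer
    when comparing output lengths of the two styles."""
    return len(text.split())


if __name__ == "__main__":
    question = "A train travels 60 km in 1.5 hours. What is its average speed?"
    for style in ("cot", "cout"):
        prompt = build_prompt(question, style)
        print(f"{style}: {token_count(prompt)} prompt tokens")
```

In practice, the token savings reported in the paper come from the model's shorter *outputs* under the internalized-reasoning instruction; this sketch only shows the shape of the prompting contrast, not the evaluation itself.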