Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought

📅 2026-04-24

🤖 AI Summary
This work proposes Abstract Chain-of-Thought (ACoT), an approach that replaces natural language reasoning chains with short sequences of discrete latent tokens drawn from a reserved vocabulary. It targets two problems: the high inference cost of explicit chain-of-thought, and the performance gap of existing non-verbal reasoning methods. ACoT combines a warm-up loop (supervised fine-tuning on masked verbal chains-of-thought, alternated with self-distillation under constrained decoding) with warm-started reinforcement learning. The model learns to reason using compact abstract symbols while maintaining task performance. Experiments report up to 11.6× fewer reasoning tokens on mathematical reasoning, instruction following, and multi-hop reasoning, with performance comparable to explicit chain-of-thought and generalization across language model families. Notably, the learned abstract symbols follow a power-law distribution reminiscent of natural language.
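One way to check for a power-law (Zipf-like) pattern of the kind mentioned above is to fit a straight line to log-frequency against log-rank. The sketch below is illustrative only and is not taken from the paper: the function name `zipf_exponent` and the `<abs_k>` token names are hypothetical, and an ordinary least-squares fit on a log-log plot is a rough diagnostic rather than a rigorous power-law test.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Estimate s in frequency ~ rank**(-s) by least squares on a log-log scale.

    `tokens` is any iterable of hashable symbols (hypothetical abstract tokens here).
    """
    # Frequencies sorted from most to least common; rank 1 is the most common.
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct tokens to fit a slope")

    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Slope of the regression line of log(count) on log(rank).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var  # negate so that a decaying distribution gives s > 0


if __name__ == "__main__":
    # Synthetic data with counts proportional to 1/rank, so s should be close to 1.
    sample = []
    for rank, count in enumerate([840, 420, 280, 210, 168, 140, 120, 105], start=1):
        sample += [f"<abs_{rank}>"] * count
    print(zipf_exponent(sample))
```

An exponent near 1 corresponds to the classic Zipf pattern seen in natural language word frequencies; tracking the fitted value across training phases would be one simple way to watch such a distribution evolve.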

📝 Abstract
While long, explicit chains-of-thought (CoT) have proven effective on complex reasoning tasks, they are costly to generate during inference. Non-verbal reasoning methods have emerged with shorter generation lengths by leveraging continuous representations, yet their performance lags behind verbalized CoT. We propose Abstract Chain-of-Thought, a discrete latent reasoning post-training mechanism in which the language model produces a short sequence of tokens from a reserved vocabulary in lieu of a natural language CoT, before generating a response. To make previously unseen "abstract" tokens useful, we introduce a policy iteration-style warm-up loop that alternates between (i) bottlenecking from a verbal CoT via masking and performing supervised fine-tuning, and (ii) self-distillation by training the model to generate abstract tokens from the prompt alone via constrained decoding with the codebook. After warm-up, we optimize the generation of abstract sequences with warm-started reinforcement learning under constrained decoding. Abstract-CoT achieves up to $11.6\times$ fewer reasoning tokens while demonstrating comparable performance across mathematical reasoning, instruction-following, and multi-hop reasoning, and generalizes across language model families. We also find an emergent power law distribution over the abstract vocabulary, akin to those seen in natural language, that evolves across the training phases. Our findings highlight the potential for post-training latent reasoning mechanisms that enable efficient inference through a learned abstract reasoning language.
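The abstract does not spell out how constrained decoding with the codebook is implemented. A common way to restrict generation to a reserved vocabulary is to mask the logits of every other token before choosing the next one, and the minimal sketch below shows that idea with greedy selection. Everything named here is an assumption for illustration: `score_fn` stands in for a language model's next-token scores, and `codebook_ids`, `end_id`, and `max_len` are hypothetical parameters, including the assumption that a dedicated end marker closes the abstract sequence.

```python
import math


def mask_logits(logits, allowed_ids):
    """Return a copy of `logits` with every disallowed position set to -inf."""
    allowed = set(allowed_ids)
    if not allowed:
        raise ValueError("allowed_ids must not be empty")
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def decode_abstract(score_fn, prompt_ids, codebook_ids, end_id, max_len):
    """Greedily decode a sequence restricted to codebook tokens.

    score_fn(prefix_ids) -> list of next-token logits (hypothetical model stand-in).
    Decoding stops at `end_id` (assumed end marker) or after `max_len` tokens.
    """
    allowed = set(codebook_ids) | {end_id}
    generated = []
    for _ in range(max_len):
        logits = score_fn(list(prompt_ids) + generated)
        masked = mask_logits(logits, allowed)
        # Greedy choice among the allowed tokens only.
        next_id = max(range(len(masked)), key=masked.__getitem__)
        if next_id == end_id:
            break
        generated.append(next_id)
    return generated


if __name__ == "__main__":
    # Toy scorer over a 6-token vocabulary: token 0 (ordinary text) always scores
    # highest, so unconstrained greedy decoding would never pick a codebook token.
    def toy_score_fn(prefix_ids):
        if len(prefix_ids) < 4:
            return [9.0, 1.0, 0.0, 2.0, 3.0, 0.5]
        return [9.0, 0.0, 0.0, 0.0, 0.0, 5.0]

    print(decode_abstract(toy_score_fn, [0, 1], codebook_ids=[3, 4], end_id=5, max_len=8))
```

Masking to negative infinity guarantees that a disallowed token can never be selected, whether the final choice is greedy or sampled from a softmax over the masked scores.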
Problem

Research questions and friction points this paper is trying to address.

Chain-of-Thought
Latent Reasoning
Efficient Inference
Abstract Reasoning
Token Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Abstract Chain-of-Thought
Latent Reasoning
Discrete Tokens
Constrained Decoding
Post-Training