AI Summary
This work systematically compares Chain-of-Thought (CoT) and Latent Thought, two distinct reasoning paradigms for large language models. Addressing the lack of formal analysis of their mechanistic differences, we propose the first unified modeling framework that integrates recurrent Transformer architectures with continuous latent-space dynamics. Within this framework, we theoretically characterize and empirically evaluate both paradigms along three dimensions: computational parallelizability, decoding strategies, and task adaptability. Our analysis reveals that CoT relies on sequential stochastic sampling, which makes it suitable for complex reasoning tasks that call for approximate solutions; in contrast, Latent Thought enables parallel, layer-wise evolution of implicit states, yielding superior computational efficiency and long-range dependency modeling. These findings provide a verifiable theoretical foundation for principled paradigm selection in model-based reasoning. The implementation is publicly available.
Abstract
Chain-of-Thought (CoT) elicits reasoning in large language models by explicitly generating intermediate steps in natural language. In contrast, Latent Thought in looped models operates directly in the continuous latent space, enabling computation beyond discrete linguistic representations. While both approaches exploit iterative computation, their comparative capabilities remain underexplored. In this work, we present a formal analysis showing that Latent Thought in Looped Transformers enables parallel computation, which is more efficient than the inherently sequential process of CoT. Conversely, CoT leverages stochastic decoding to approximate solutions to problems where exact computation is intractable. These separations indicate which tasks are better suited to depth-driven recursion and which to explicit step-by-step generation, thereby offering practical guidance for choosing between reasoning paradigms. Code is available at https://github.com/kevin671/cot-vs-loop.
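The contrast drawn in the abstract can be illustrated with a small toy sketch. It is not the paper's construction and is not taken from the linked repository; the function names (`looped_prefix_sums`, `cot_prefix_sums`, `cot_sampled_estimate`) are hypothetical, and prefix sums stand in for a generic parallelizable task. The looped version updates every position at once in each iteration, the CoT-style version emits one intermediate result per step, and the sampling routine shows how stochastic decoding can approximate a quantity instead of computing it exactly.

```python
import random


def looped_prefix_sums(xs):
    """Looped toy (assumption): each loop updates all positions at once."""
    state = list(xs)
    loops = 0
    offset = 1
    while offset < len(state):
        # Every position reads only the previous state, so the whole
        # update could be evaluated in parallel (parallel doubling).
        state = [
            state[i] + (state[i - offset] if i >= offset else 0)
            for i in range(len(state))
        ]
        offset *= 2
        loops += 1
    return state, loops


def cot_prefix_sums(xs):
    """CoT toy (assumption): one intermediate 'token' per sequential step."""
    chain = []
    running = 0
    for x in xs:
        # Each step depends on the result written in the previous step.
        running += x
        chain.append(running)
    return chain, len(chain)


def cot_sampled_estimate(predicate, n_bits, n_samples, rng):
    """Sampling toy (assumption): estimate the fraction of bit strings
    that satisfy `predicate` by drawing random strings."""
    hits = 0
    for _ in range(n_samples):
        sample = [rng.randint(0, 1) for _ in range(n_bits)]
        hits += bool(predicate(sample))
    return hits / n_samples


xs = [1, 0, 1, 1, 0, 1, 0, 1]
looped_result, loop_count = looped_prefix_sums(xs)
cot_result, step_count = cot_prefix_sums(xs)

# Approximate the share of 16-bit strings with an even number of ones.
estimate = cot_sampled_estimate(
    lambda bits: sum(bits) % 2 == 0, 16, 2000, random.Random(0)
)
```

For the eight-element input, both routines return the same prefix sums, but the looped version needs three parallel iterations while the sequential version needs eight steps. The gap grows from logarithmic to linear in the input length, which is the kind of separation the abstract attributes to parallel versus sequential computation.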