Do Latent-CoT Models Think Step-by-Step? A Mechanistic Study on Sequential Reasoning Tasks

📅 2026-01-31
📈 Citations: 1
Influential: 0
🤖 AI Summary
This study investigates whether Latent-CoT models, exemplified by CODI, genuinely perform implicit step-by-step reasoning. Using interpretability techniques (logit-lens decoding, linear probing, attention analysis, and activation patching), the work systematically examines how intermediate states are represented and propagated in polynomial-iteration tasks. The analysis reveals, for the first time, that CODI constructs complete reasoning paths in short-hop tasks but shifts to compressed shortcuts in long-hop settings, retaining only late-stage intermediate representations. This shortcut strategy proves highly fragile under distributional shift or increased optimization difficulty, exposing a fundamental vulnerability in the model's reasoning process.
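As a rough illustration of the logit-lens technique named in the summary, the sketch below decodes a toy intermediate hidden state by projecting it through an unembedding matrix. The model, shapes, and weights are hypothetical stand-ins for illustration, not CODI's actual implementation.

```python
import numpy as np

def logit_lens(hidden_state, unembedding):
    """Decode a hidden state at any layer/position into vocabulary logits
    by projecting it through the model's unembedding matrix."""
    return hidden_state @ unembedding.T

rng = np.random.default_rng(0)
d_model, vocab = 16, 10
# Toy unembedding: near-identity so the decoding below is unambiguous.
W_U = np.eye(vocab, d_model) + rng.uniform(-0.1, 0.1, size=(vocab, d_model))

# Suppose a latent-thought position holds an intermediate "bridge" state
# aligned with token 3; if the top-1 decoded token matches the expected
# intermediate value, that state counts as decodable at this position.
h = W_U[3]
top_token = int(np.argmax(logit_lens(h, W_U)))
print(top_token)  # 3: the intermediate state is decodable here
```

In practice the hidden state would come from a real forward pass and the unembedding from the trained model, with the final layer norm applied before projection.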

📝 Abstract
Latent Chain-of-Thought (Latent-CoT) aims to enable step-by-step computation without emitting long rationales, yet its mechanisms remain unclear. We study CODI, a continuous-thought teacher-student distillation model, on strictly sequential polynomial-iteration tasks. Using logit-lens decoding, linear probes, attention analysis, and activation patching, we localize intermediate-state representations and trace their routing to the final readout. On two- and three-hop tasks, CODI forms the full set of bridge states that become decodable across latent-thought positions, while the final input follows a separate near-direct route; predictions arise via late fusion at the end-of-thought boundary. For longer hop lengths, CODI does not reliably execute a full latent rollout, instead exhibiting a partial latent reasoning path that concentrates on late intermediates and fuses them with the last input at the answer readout position. Ablations show that this partial pathway can collapse under regime shifts, including harder optimization. Overall, we delineate when CODI-style latent-CoT yields faithful iterative computation versus compressed or shortcut strategies, and highlight challenges in designing robust latent-CoT objectives for sequential reasoning.
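The activation patching used to trace routing to the final readout can be sketched in miniature: cache an activation from a clean run, splice it into a corrupted run, and check whether the output recovers. The two-stage "model" below is a hypothetical stand-in for a latent-thought position feeding the answer readout, not CODI itself.

```python
# Toy activation-patching (causal-tracing) sketch. Stage 1 produces the
# intermediate "bridge" state; stage 2 reads it into the final answer.
def stage1(x):
    return x * 2

def stage2(h):
    return h + 1

def run(x, patch_h=None):
    h = stage1(x)
    if patch_h is not None:
        h = patch_h  # overwrite the activation with a cached one
    return stage2(h)

clean_x, corrupt_x = 3, 7
clean_h = stage1(clean_x)               # cache activation from the clean run
corrupt_out = run(corrupt_x)            # corrupted run: answer is wrong
patched_out = run(corrupt_x, clean_h)   # patched run: clean answer restored
print(corrupt_out, patched_out)         # 15 7
```

If patching a position's activation restores the clean answer, that position causally carries the intermediate state; if not, the model is routing the answer through some other path.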
Problem

Research questions and friction points this paper is trying to address.

Latent Chain-of-Thought
sequential reasoning
iterative computation
latent reasoning
mechanistic study
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Chain-of-Thought
mechanistic interpretability
sequential reasoning
intermediate-state representation
distillation