🤖 AI Summary
This study investigates how internal representations evolve during chain-of-thought reasoning in large language models, aiming to understand, predict, and intervene on the correctness of their outputs. By modeling reasoning as structured trajectories in representation space, the work reveals, for the first time, that reasoning steps progress through specific subspaces in an ordered manner, with correct and incorrect solutions systematically diverging in later stages. Leveraging the geometric properties of these trajectories, the authors propose a novel inference-time trajectory-guidance paradigm. Combining representational-geometry analysis, trajectory clustering, and ROC-AUC evaluation, the study predicts final-answer correctness with high accuracy as early as the midpoint of reasoning (ROC-AUC of 0.87) and supports dynamic correction of reasoning as well as control over its length.
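The mid-reasoning correctness prediction described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the synthetic hidden states, the difference-of-means linear probe (a stand-in for whatever probe the paper actually trains), and the hidden-state dimension are all hypothetical, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # hypothetical hidden-state dimension

# Simulate mid-reasoning hidden states: correct and incorrect traces share
# a common Gaussian component, with correct traces shifted along one direction
# (mimicking the late-stage divergence the study reports).
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
correct = rng.normal(size=(200, d)) + 1.5 * direction
incorrect = rng.normal(size=(200, d))

X = np.vstack([correct, incorrect])
y = np.concatenate([np.ones(200), np.zeros(200)])

# Simple linear probe: score each state by its projection onto the
# difference of class means.
w = correct.mean(axis=0) - incorrect.mean(axis=0)
scores = X @ w

def roc_auc(y_true, y_score):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

auc = roc_auc(y, scores)
print(f"mid-reasoning probe ROC-AUC: {auc:.2f}")
```

With synthetic data this separable, the probe scores well; on real reasoning traces the separability, and hence the AUC, depends on how strongly correct and incorrect trajectories have diverged by the step being probed.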
📝 Abstract
This work characterizes large language models' chain-of-thought generation as a structured trajectory through representation space. We show that mathematical reasoning traverses functionally ordered, step-specific subspaces that become increasingly separable with layer depth. This structure already exists in base models, while reasoning training primarily accelerates convergence toward termination-related subspaces rather than introducing new representational organization. While early reasoning steps follow similar trajectories, correct and incorrect solutions diverge systematically at late stages. This late-stage divergence enables mid-reasoning prediction of final-answer correctness with ROC-AUC up to 0.87. Furthermore, we introduce trajectory-based steering, an inference-time intervention framework that enables reasoning correction and length control based on derived ideal trajectories. Together, these results establish reasoning trajectories as a geometric lens for interpreting, predicting, and controlling LLM reasoning behavior.
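The trajectory-based steering idea can be sketched as a simple inference-time nudge of the hidden state toward a reference trajectory. The reference path, the `steer` helper, and the strength `alpha` below are all illustrative assumptions; the paper's derivation of ideal trajectories is not reproduced here.

```python
import numpy as np

def steer(hidden, ideal_point, alpha=0.3):
    """Move `hidden` a fraction `alpha` of the way toward a reference point."""
    return hidden + alpha * (ideal_point - hidden)

rng = np.random.default_rng(1)
d = 16  # hypothetical hidden-state dimension

# A stand-in "ideal trajectory": a random walk through representation space
# representing the per-step reference points a steering method would target.
ideal_trajectory = np.cumsum(rng.normal(size=(10, d)), axis=0)

initial_state = rng.normal(size=d)
state = initial_state.copy()
for target in ideal_trajectory:
    state = steer(state, target)  # inference-time intervention at each step
    # ...in a real system, the steered state would be fed back into the model...

# After repeated steering, the state tracks the end of the reference path
# more closely than the unsteered starting point did.
print(np.linalg.norm(state - ideal_trajectory[-1]))
```

Length control fits the same template: steering toward (or away from) termination-related reference points would encourage the model to wrap up earlier or continue reasoning.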