Chain-of-Thought Tokens are Computer Program Variables

📅 2025-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper investigates the role of chain-of-thought (CoT) tokens in large language models (LLMs), focusing on two compositional reasoning tasks: multi-digit multiplication and dynamic programming. Through interventions that include keeping only intermediate-result tokens, storing those results in an alternative latent form, and randomly altering values in the chain, we show that CoT tokens behave much like program variables. Retaining only the tokens that store intermediate results yields comparable task performance, and altered values propagate through subsequent reasoning steps and change the final answer correspondingly. We also note potential drawbacks, including unintended shortcuts and limits on the computational complexity between tokens. To support reproducible research, we publicly release all code and data.

📝 Abstract

Chain-of-thought (CoT) prompting requires large language models (LLMs) to generate intermediate steps before reaching the final answer, and has proven effective in helping LLMs solve complex reasoning tasks. However, the inner mechanism of CoT remains largely unclear. In this paper, we empirically study the role of CoT tokens in LLMs on two compositional tasks: multi-digit multiplication and dynamic programming. While CoT is essential for solving these problems, we find that preserving only the tokens that store intermediate results achieves comparable performance. Furthermore, we observe that storing intermediate results in an alternative latent form does not affect model performance. We also randomly intervene on some values in the CoT and observe that subsequent CoT tokens and the final answer change correspondingly. These findings suggest that CoT tokens may function like variables in computer programs, but with potential drawbacks such as unintended shortcuts and computational complexity limits between tokens. The code and data are available at https://github.com/solitaryzero/CoTs_are_Variables.
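The variable analogy can be made concrete with a small sketch. The code below is not the authors' implementation and involves no language model; it is a plain-Python illustration, under the assumption that a multiplication CoT writes out one partial product per digit of the second operand. The function name `multiply_with_cot` and the `overrides` parameter are hypothetical names introduced here. Overwriting one stored intermediate value changes every later running sum and the final answer, which is the behavior the abstract describes for interventions on CoT values.

```python
from typing import Dict, List, Optional, Tuple


def multiply_with_cot(
    a: int, b: int, overrides: Optional[Dict[int, int]] = None
) -> Tuple[List[int], List[int], int]:
    """Multiply a by b (b assumed non-negative) via per-digit partial products.

    Each partial product and running sum plays the role of a "variable"
    that a chain of thought would write out as tokens.

    overrides maps a step index to a replacement partial product; it is a
    hypothetical stand-in for intervening on a value in the chain.
    """
    overrides = overrides or {}
    partials: List[int] = []
    running_sums: List[int] = []
    total = 0
    # Walk the digits of b from least to most significant.
    for step, digit_char in enumerate(reversed(str(b))):
        partial = a * int(digit_char) * 10 ** step
        # An intervention overwrites the stored intermediate value.
        partial = overrides.get(step, partial)
        partials.append(partial)
        total += partial
        running_sums.append(total)
    return partials, running_sums, total


if __name__ == "__main__":
    # Unmodified chain: 615 + 4920 = 5535, which equals 123 * 45.
    print(multiply_with_cot(123, 45))
    # Intervene on the first partial product (615 -> 600); the later
    # running sum and the final answer shift by the same amount.
    print(multiply_with_cot(123, 45, overrides={0: 600}))
```

In this toy setting the final answer depends only on the stored intermediate values, so keeping just those values is enough to recover it. Whether an LLM behaves the same way is the empirical question the paper studies.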

Problem

Research questions and friction points this paper is trying to address.

Understanding the inner mechanism of Chain-of-Thought tokens in LLMs

Evaluating the impact of preserving only intermediate result tokens

Exploring if CoT tokens function like computer program variables

Innovation

Methods, ideas, or system contributions that make the work stand out.

CoT tokens act like program variables

Preserving intermediate results maintains performance

Alternative latent forms do not affect outcomes

🔎 Similar Papers

Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs