🤖 AI Summary
Large language models frequently exhibit systematic reasoning errors even on simple tasks, yet existing analyses lack precise mechanisms for attributing those errors. This work proposes a formal framework grounded in deterministic multi-tape Turing machines, decomposing the reasoning pipeline into distinct components (input characters, tokens, vocabulary, model parameters, activations, probability distributions, and output text) and thereby replacing vague geometric metaphors with a falsifiable theoretical foundation. The framework identifies concrete failure modes, such as tokenisation-induced loss of character-level structure; elucidates the mechanisms and limitations of techniques like chain-of-thought prompting; and clarifies how externalising computation mitigates errors. By enabling rigorous, component-wise analysis of model behaviour, it offers a principled way to diagnose systematic errors in large language models.
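As a rough sketch of the decomposition (our own illustration using hypothetical names, not code from the paper), each pipeline component could be modelled as a separate tape of the machine:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: one field per tape of the proposed multi-tape
# Turing machine. Field names and types are our own assumptions.
@dataclass
class LLMTapes:
    input_chars: list[str] = field(default_factory=list)     # raw characters
    tokens: list[int] = field(default_factory=list)          # token ids
    vocabulary: dict[int, str] = field(default_factory=dict) # id -> string
    parameters: list[float] = field(default_factory=list)    # frozen weights
    activations: list[float] = field(default_factory=list)   # scratch state
    distribution: list[float] = field(default_factory=list)  # next-token probs
    output_text: list[str] = field(default_factory=list)     # emitted text

# A failure can then be localised to the tape where information is lost,
# e.g. character structure disappearing in the input_chars -> tokens step.
tapes = LLMTapes(input_chars=list("hi"))
```

The point of such a decomposition is that each error class is attributed to exactly one tape-to-tape transition rather than to the model as an undifferentiated whole.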
📝 Abstract
Large language models (LLMs) exhibit failure modes on seemingly trivial tasks. We propose a formalisation of LLM interaction as a deterministic multi-tape Turing machine, where each tape represents a distinct component of the pipeline: input characters, tokens, vocabulary, model parameters, activations, probability distributions, and output text. This model enables precise localisation of failure modes to specific pipeline stages, revealing, for example, how tokenisation obscures the character-level structure needed for counting tasks. It also clarifies why techniques like chain-of-thought prompting help, namely by externalising computation onto the output tape, while exposing their fundamental limitations. The approach provides a rigorous, falsifiable alternative to geometric metaphors and complements empirical scaling laws with principled error analysis.
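The tokenisation failure mode can be made concrete with a toy greedy tokenizer (the merge table and names below are our own invention for illustration, not the paper's): once the character tape is rewritten onto the token tape, a character-counting question no longer has the answer "written on the tape" the model reads.

```python
# Hypothetical merge table: common subwords absorb individual characters.
TOY_MERGES = ["straw", "berry", "rr"]

def toy_tokenize(text: str) -> list[str]:
    """Greedy longest-match tokenisation over the toy merge table."""
    tokens, i = [], 0
    while i < len(text):
        for merge in sorted(TOY_MERGES, key=len, reverse=True):
            if text.startswith(merge, i):
                tokens.append(merge)
                i += len(merge)
                break
        else:  # no merge applies: emit a single character
            tokens.append(text[i])
            i += 1
    return tokens

tokens = toy_tokenize("strawberry")   # -> ["straw", "berry"]

# Character tape: 10 symbols, three of them 'r'.
char_count = "strawberry".count("r")              # -> 3
# Token tape: 'r' never appears as a standalone symbol,
# so counting over tokens gives a different answer.
token_count = sum(1 for t in tokens if t == "r")  # -> 0
```

In the multi-tape picture, the count-of-characters question refers to the input-character tape, but the model computes over the token tape, which is exactly the localisation the formalism is meant to express.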