Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions

📅 2026-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study examines how Meta-Llama-3-8B (base) finalizes the result of three-digit addition once cross-token information routing is complete, using a one-token readout. Causal residual patching and cumulative attention ablations localize a post-routing boundary near layer 17. Beyond it, the decoded sum is controlled almost entirely by the last input token, and late-layer self-attention is largely dispensable. In this regime, digit direction dictionaries differ across next-higher-digit contexts but are related by an approximately orthogonal map within a shared low-rank subspace, recovered by low-rank Procrustes alignment. Causal editing matches this geometry: naive cross-context transfer of directions fails, while rotating them through the learned map restores strict counterfactual digit edits. Negative controls do not restore the edits, which supports the specificity of the mechanism.
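The patching procedure described above can be illustrated with a minimal sketch on a toy residual stack. This is not Llama-3 and not the authors' code: the layer sizes, the `run` helper, and the hard-coded `BOUNDARY` are hypothetical, and the boundary exists here only by construction. The sketch overwrites the last-token residual of a clean run with the residual from a counterfactual run, one layer at a time, and measures how far the patched output is from the counterfactual output.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy sizes; BOUNDARY plays the role of the post-routing layer.
N_LAYERS, BOUNDARY, T, D = 6, 3, 4, 8
W = rng.normal(scale=0.3, size=(N_LAYERS, D, D))


def run(x, patch_layer=None, patch_vec=None):
    """Run the toy stack on x of shape (T, D).

    If patch_layer is given, the last-token residual is overwritten with
    patch_vec at the input of that layer. Returns the final last-token
    state and a cache of last-token residuals at each layer input.
    """
    x = x.copy()
    cache = []
    for layer in range(N_LAYERS):
        if layer == patch_layer:
            x[-1] = patch_vec
        cache.append(x[-1].copy())
        if layer < BOUNDARY:
            # "Routing" layers: every position reads the mean over positions.
            x = x + np.tanh(x.mean(axis=0, keepdims=True) @ W[layer])
        else:
            # "Post-routing" layers: strictly per-token updates.
            x = x + np.tanh(x @ W[layer])
    return x[-1], cache


clean = rng.normal(size=(T, D))
counterfactual = rng.normal(size=(T, D))
out_cf, cf_cache = run(counterfactual)

# Patch the counterfactual last-token residual into the clean run, layer by
# layer, and record the distance to the counterfactual output.
gap = []
for layer in range(N_LAYERS):
    out_patched, _ = run(clean, patch_layer=layer, patch_vec=cf_cache[layer])
    gap.append(float(np.linalg.norm(out_patched - out_cf)))
```

In this toy, `gap` drops to numerical zero from `BOUNDARY` onward, because no later layer reads from other positions. Patches at earlier layers leave a nonzero gap, since the remaining routing layers mix clean-run information back in. A sharp transition of this kind is what the layer-17 boundary refers to.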

📝 Abstract
We study three-digit addition in Meta-Llama-3-8B (base) under a one-token readout to characterize how arithmetic answers are finalized after cross-token routing becomes causally irrelevant. Causal residual patching and cumulative attention ablations localize a sharp boundary near layer 17: beyond it, the decoded sum is controlled almost entirely by the last input token and late-layer self-attention is largely dispensable. In this post-routing regime, digit(-sum) direction dictionaries vary with a next-higher-digit context but are well-related by an approximately orthogonal map inside a shared low-rank subspace (low-rank Procrustes alignment). Causal digit editing matches this geometry: naive cross-context transfer fails, while rotating directions through the learned map restores strict counterfactual edits; negative controls do not recover.
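As a rough illustration of low-rank Procrustes alignment, the sketch below fits an orthogonal map between two synthetic direction dictionaries inside a shared low-rank subspace. It makes several assumptions not stated in the abstract: the dictionaries are built from random data rather than model activations, the shared subspace is estimated by an SVD of the stacked dictionaries, and the names `low_rank_procrustes` and `transfer` are hypothetical. The authors' exact procedure may differ.

```python
import numpy as np


def low_rank_procrustes(A, B, rank):
    """Fit an orthogonal map from dictionary A to dictionary B.

    A, B: (n_digits, d_model) arrays, one direction per row.
    Returns (basis, R). basis is (d_model, rank) with orthonormal columns
    spanning an estimated shared subspace. R is a (rank, rank) orthogonal
    matrix such that (A @ basis) @ R approximates B @ basis.
    """
    # Shared subspace: top right-singular vectors of the stacked dictionaries.
    _, _, vt = np.linalg.svd(np.vstack([A, B]), full_matrices=False)
    basis = vt[:rank].T
    a, b = A @ basis, B @ basis
    # Orthogonal Procrustes solution: R = U V^T with U S V^T = SVD(a^T b).
    u, _, wt = np.linalg.svd(a.T @ b)
    return basis, u @ wt


def transfer(A, basis, R):
    """Rotate context-0 directions into context 1, back in model space."""
    return (A @ basis) @ R @ basis.T


# Synthetic stand-in data (assumption): two contexts share a rank-4 subspace
# and differ by a hidden rotation inside it.
rng = np.random.default_rng(0)
n_digits, d_model, rank = 10, 64, 4
Q, _ = np.linalg.qr(rng.normal(size=(d_model, rank)))  # subspace basis
R_true, _ = np.linalg.qr(rng.normal(size=(rank, rank)))  # hidden rotation
coeffs = rng.normal(size=(n_digits, rank))
A = coeffs @ Q.T           # context-0 directions
B = coeffs @ R_true @ Q.T  # context-1 directions

basis, R = low_rank_procrustes(A, B, rank)
naive_err = np.linalg.norm(A - B) / np.linalg.norm(B)
rotated_err = np.linalg.norm(transfer(A, basis, R) - B) / np.linalg.norm(B)
```

On this noiseless synthetic data, `rotated_err` is at numerical precision while `naive_err` is not. This mirrors the qualitative contrast between naive cross-context transfer and rotation through the learned map. With real activations the fit would only be approximate.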
Problem

Research questions and friction points this paper is trying to address.

post-routing arithmetic
last-token readout
digit direction
low-rank subspace
causal editing
Innovation

Methods, ideas, or system contributions that make the work stand out.

post-routing arithmetic
low-rank Procrustes alignment
digit direction geometry
causal residual patching
counterfactual editing