Symmetry Breaking in Transformers for Efficient and Interpretable Training

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the redundant rotational degrees of freedom inherent in the standard Transformer attention mechanism, which, while not affecting model outputs, complicate optimization and hinder interpretability. The authors propose the first application of symmetry breaking to Transformers by introducing fixed, non-learnable query and value biases generated via batch sampling, thereby embedding a preferred direction into the attention computation that breaks the rotational symmetry. This minimal architectural modification substantially improves training efficiency with memory-efficient optimizers such as SGD with momentum (SGDM), narrowing the performance gap with adaptive optimizers. The approach achieves strong results on downstream logical reasoning tasks and validation loss, while simultaneously enhancing model interpretability by selectively amplifying semantically meaningful structure within individual attention heads.

📝 Abstract
The attention mechanism in its standard implementation contains extraneous rotational degrees of freedom that are carried through computation but do not affect model activations or outputs. We introduce a simple symmetry-breaking protocol that inserts a preferred direction into this rotational space through batchwise-sampled, unlearned query and value biases. This modification has two theoretically motivated and empirically validated consequences. First, it can substantially improve the performance of simple, memory-efficient optimizers, narrowing -- and in some cases closing -- the gap to successful but more complex memory-intensive adaptive methods. We demonstrate this by pretraining 124M parameter transformer models with four optimization algorithms (AdamW, SOAP, SGDM, and Energy Conserving Descent (ECD)) and evaluating both validation loss and downstream logical reasoning. Second, it enables an interpretable use of otherwise redundant rotational degrees of freedom, selectively amplifying semantically meaningful token classes within individual attention heads. Overall, our results show that minimal, principled architectural changes can simultaneously improve performance and interpretability.
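The fixed query and value biases described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the single-head setup, the Gaussian sampling of the frozen biases (standing in for the paper's batchwise sampling), and the function name `attention_with_fixed_biases` are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, seq_len = 16, 5

# Frozen, non-learnable biases: sampled once and never updated during
# training. They pin a preferred direction in the query/value spaces,
# breaking the rotational symmetry of the standard attention head.
# (Gaussian sampling here is an assumption; the paper samples batchwise.)
q_bias = rng.normal(size=d_model)
v_bias = rng.normal(size=d_model)

def attention_with_fixed_biases(x, Wq, Wk, Wv):
    """Single-head scaled dot-product attention where the query and
    value projections receive fixed additive biases."""
    q = x @ Wq + q_bias          # preferred direction in query space
    k = x @ Wk
    v = x @ Wv + v_bias          # preferred direction in value space
    scores = q @ k.T / np.sqrt(d_model)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

x = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = attention_with_fixed_biases(x, Wq, Wk, Wv)
print(out.shape)  # (5, 16)
```

Because the biases carry no gradients, this modification adds no trainable parameters and no optimizer state, which is consistent with the paper's focus on memory-efficient optimizers such as SGDM.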
Problem

Research questions and friction points this paper is trying to address.

symmetry breaking
attention mechanism
rotational degrees of freedom
training efficiency
interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

symmetry breaking
attention mechanism
memory-efficient optimization
interpretable transformers
rotational degrees of freedom