Revisiting Anisotropy in Language Transformers: The Geometry of Learning Dynamics

📅 2026-04-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses anisotropy in the representation space of language Transformers, which complicates geometric interpretation of these models. Combining geometric analysis with concept-based mechanistic interpretability, the study fits activation-derived low-rank tangent proxy directions during training rather than only post hoc, and tests them against gradients from ordinary backpropagation. On the theoretical side, the authors give geometric arguments for how frequency-biased sampling attenuates curvature visibility and why training preferentially amplifies tangent directions. Empirically, the activation-derived directions capture unusually large gradient energy and a substantially larger share of gradient anisotropy than matched-rank normal controls, across both encoder-style and decoder-style models. The authors present these results as strong empirical support for a tangent-aligned account of anisotropy.

📝 Abstract
Since their introduction, Transformer architectures have dominated Natural Language Processing (NLP). However, recent research has highlighted an inherent anisotropy phenomenon in these models, presenting a significant challenge to their geometric interpretation. Previous theoretical studies on this phenomenon are rarely grounded in the underlying representation geometry. In this paper, we extend them by deriving geometric arguments for how frequency-biased sampling attenuates curvature visibility and why training preferentially amplifies tangent directions. Empirically, we then use concept-based mechanistic interpretability during training, rather than only post hoc, to fit activation-derived low-rank tangent proxies and test them against true gradients obtained by ordinary backpropagation. Across encoder-style and decoder-style language models, we find that these activation-derived directions capture both unusually large gradient energy and a substantially larger share of gradient anisotropy than matched-rank normal controls, providing strong empirical support for a tangent-aligned account of anisotropy.
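
To make the idea of "gradient energy captured by a low-rank subspace" concrete, the following is a minimal NumPy sketch, not the authors' procedure. It assumes (a) the proxy subspace is the top-k principal directions of centered activations, (b) the control is a random subspace of the same rank, used here as a simple stand-in for the paper's matched-rank normal controls, and (c) the data are synthetic with a planted rank-k structure. All function names are hypothetical, and the share of gradient anisotropy is not computed here.

```python
import numpy as np


def top_k_subspace(acts, k):
    """Orthonormal basis (d x k) for the top-k principal directions of acts (n x d)."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    # Rows of vt are right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def random_subspace(d, k, rng):
    """Orthonormal basis (d x k) of a random k-dimensional subspace (assumed control)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q


def energy_fraction(grads, basis):
    """Share of the total squared gradient norm that lies inside span(basis)."""
    projected = grads @ basis  # coordinates of each gradient row in the subspace
    return float(np.sum(projected ** 2) / np.sum(grads ** 2))


def synthetic_demo(n=512, d=64, k=4, seed=0):
    """Compare the activation-derived proxy with a random control on synthetic data."""
    rng = np.random.default_rng(seed)
    # Hypothetical setup: activations and gradients share one planted rank-k subspace.
    planted = random_subspace(d, k, rng)
    acts = 5.0 * (rng.standard_normal((n, k)) @ planted.T)
    acts += 0.1 * rng.standard_normal((n, d))
    grads = 3.0 * (rng.standard_normal((n, k)) @ planted.T)
    grads += 0.1 * rng.standard_normal((n, d))

    proxy = top_k_subspace(acts, k)
    control = random_subspace(d, k, rng)
    return energy_fraction(grads, proxy), energy_fraction(grads, control)


if __name__ == "__main__":
    proxy_frac, control_frac = synthetic_demo()
    print(f"proxy subspace:   {proxy_frac:.3f}")
    print(f"random control:   {control_frac:.3f}")
```

In this toy setting the proxy should capture most of the gradient energy simply because the structure was planted. With real models, `acts` and `grads` would be collected at a chosen layer during training, and the choice of control subspace would need to follow the paper's definition.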
Problem

Research questions and friction points this paper is trying to address.

anisotropy
Transformers
representation geometry
language models
gradient dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

anisotropy
representation geometry
tangent directions
mechanistic interpretability
gradient dynamics