Outlier dimensions favor frequent tokens in language models

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study identifies a pervasive phenomenon in the final layer of large language models (LLMs): outlier dimensions, i.e. dimensions that exhibit extreme activations for the majority of inputs. To characterize them, the authors combine activation statistics, dimension-level attribution, weight decomposition, and tracking of training dynamics. They find that outlier dimensions implement a learnable, controllable heuristic for predicting frequent tokens; the dimensions emerge mid-training, are boosted by parameters in modules such as the QKV projections, and rely on compensatory weight mass assigned to the remaining dimensions, which lets the model suppress the heuristic when it is contextually inappropriate. The work systematically identifies and interprets outlier dimensions as a cross-model, interpretable structural motif, quantifies their contribution to frequent-token prediction, elucidates their parameter dependencies and gating behavior, and points toward targeted, controllable interventions in language models.

📝 Abstract
We study last-layer outlier dimensions, i.e. dimensions that display extreme activations for the majority of inputs. We show that outlier dimensions arise in many different modern language models, and trace their function back to the heuristic of constantly predicting frequent words. We further show how a model can block this heuristic when it is not contextually appropriate, by assigning a counterbalancing weight mass to the remaining dimensions, and we investigate which model parameters boost outlier dimensions and when they arise during training. We conclude that outlier dimensions are a specialized mechanism discovered by many distinct models to implement a useful token prediction heuristic.
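The activation-statistics approach described in the abstract can be sketched with a minimal detector: flag dimensions whose typical activation magnitude dwarfs the rest. This is our illustration, not the paper's code; the threshold `k` and the mean-absolute-activation statistic are assumptions.

```python
import numpy as np

def find_outlier_dimensions(hidden_states: np.ndarray, k: float = 6.0) -> np.ndarray:
    """Flag dimensions whose mean |activation| exceeds k times the
    median per-dimension mean |activation| (k is a hypothetical cutoff)."""
    # hidden_states: (num_tokens, hidden_dim) last-layer activations
    mean_abs = np.abs(hidden_states).mean(axis=0)  # per-dimension scale
    threshold = k * np.median(mean_abs)            # robust baseline across dims
    return np.where(mean_abs > threshold)[0]       # indices of outlier dimensions

# Toy demo: inject one extreme dimension into otherwise unit-scale activations.
rng = np.random.default_rng(0)
acts = rng.normal(0.0, 1.0, size=(1000, 64))
acts[:, 7] *= 50.0                                 # simulate an outlier dimension
print(find_outlier_dimensions(acts))               # → [7]
```

On real models one would collect `hidden_states` by running a corpus through the network and capturing the residual stream before the unembedding; the detector itself is model-agnostic.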
Problem

Research questions and friction points this paper is trying to address.

Identify outlier dimensions across modern language models
Understand their role in frequent-token prediction
Explain how models counterbalance outlier effects when contextually inappropriate
Innovation

Methods, ideas, or system contributions that make the work stand out.

Traces extreme last-layer activations to a frequent-token prediction heuristic
Shows counterbalancing weight mass on remaining dimensions blocks the heuristic when inappropriate
Identifies outlier dimensions as a specialized mechanism discovered independently by many models
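The counterbalancing idea above can be made concrete by decomposing a token's logit into per-dimension contributions h_d · W[d, token]: the outlier dimension pushes a frequent token's logit up, while weight mass spread over the remaining dimensions can cancel that push. The sketch below is our illustration with made-up numbers, not the paper's actual decomposition.

```python
import numpy as np

rng = np.random.default_rng(1)
hidden_dim, vocab = 8, 5
h = rng.normal(size=hidden_dim)
h[2] = 40.0                        # hypothetical outlier dimension
W = rng.normal(scale=0.1, size=(hidden_dim, vocab))  # toy unembedding matrix
W[2, 0] = 0.5                      # strong weight tying dim 2 to a frequent token (index 0)

# Per-dimension logit contributions: contrib[d, t] = h[d] * W[d, t]
contrib = h[:, None] * W
outlier_push = contrib[2, 0]                    # outlier dimension's push toward token 0
rest_mass = contrib[:, 0].sum() - outlier_push  # counterbalancing mass from other dims
print(outlier_push, rest_mass)
```

Blocking the heuristic corresponds to the model learning enough negative `rest_mass` on the remaining dimensions to offset `outlier_push` in contexts where the frequent token is wrong.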