A distributional simplicity bias in the learning dynamics of transformers

📅 2024-10-25

🏛️ Neural Information Processing Systems

📈 Citations: 4

✨ Influential: 0

🤖 AI Summary

This work investigates whether self-supervised Transformer pretraining exhibits a distribution-level simplicity bias—i.e., whether it preferentially learns low-order token interactions first, then progressively captures higher-order nonlinear dependencies, thereby enabling strong generalization. To probe this, the authors introduce an interaction-order-controlled data cloning method, enabling the first empirical demonstration of a “simple-to-complex” phased learning pattern during pretraining: low-order interactions converge rapidly and saturate in error, while high-order interactions emerge later. Integrating many-body information entropy estimation with training dynamics trajectory analysis, they construct an interpretable diagnostic framework. Key contributions are: (1) identification and validation of a distribution-level simplicity bias; (2) establishment of a causal, interaction-order-controllable analytical paradigm for probing representational learning; and (3) provision of novel, quantifiable evidence elucidating the generalization mechanisms of large language models.

Technology Category

Application Category

📝 Abstract

The remarkable capability of over-parameterised neural networks to generalise effectively has been explained by invoking a ``simplicity bias'': neural networks prevent overfitting by initially learning simple classifiers before progressing to more complex, non-linear functions. While simplicity biases have been described theoretically and experimentally in feed-forward networks for supervised learning, the extent to which they also explain the remarkable success of transformers trained with self-supervised techniques remains unclear. In our study, we demonstrate that transformers, trained on natural language data, also display a simplicity bias. Specifically, they sequentially learn many-body interactions among input tokens, reaching a saturation point in the prediction error for low-degree interactions while continuing to learn high-degree interactions. To conduct this analysis, we develop a procedure to generate extit{clones} of a given natural language data set, which rigorously capture the interactions between tokens up to a specified order. This approach opens up the possibilities of studying how interactions of different orders in the data affect learning, in natural language processing and beyond.

Problem

Research questions and friction points this paper is trying to address.

Examines simplicity bias in transformers' learning dynamics.

Analyzes sequential learning of token interactions.

Develops method to study data interaction effects.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformers exhibit simplicity bias

Developed dataset clones for analysis

Studied token interaction orders

🔎 Similar Papers

No similar papers found.

Authors to Follow