A distributional simplicity bias in the learning dynamics of transformers

๐Ÿ“… 2024-10-25
๐Ÿ›๏ธ Neural Information Processing Systems
๐Ÿ“ˆ Citations: 4
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work investigates whether self-supervised Transformer pretraining exhibits a distribution-level simplicity biasโ€”i.e., whether it preferentially learns low-order token interactions first, then progressively captures higher-order nonlinear dependencies, thereby enabling strong generalization. To probe this, the authors introduce an interaction-order-controlled data cloning method, enabling the first empirical demonstration of a โ€œsimple-to-complexโ€ phased learning pattern during pretraining: low-order interactions converge rapidly and saturate in error, while high-order interactions emerge later. Integrating many-body information entropy estimation with training dynamics trajectory analysis, they construct an interpretable diagnostic framework. Key contributions are: (1) identification and validation of a distribution-level simplicity bias; (2) establishment of a causal, interaction-order-controllable analytical paradigm for probing representational learning; and (3) provision of novel, quantifiable evidence elucidating the generalization mechanisms of large language models.

Technology Category

Application Category

๐Ÿ“ Abstract
The remarkable capability of over-parameterised neural networks to generalise effectively has been explained by invoking a ``simplicity bias'': neural networks prevent overfitting by initially learning simple classifiers before progressing to more complex, non-linear functions. While simplicity biases have been described theoretically and experimentally in feed-forward networks for supervised learning, the extent to which they also explain the remarkable success of transformers trained with self-supervised techniques remains unclear. In our study, we demonstrate that transformers, trained on natural language data, also display a simplicity bias. Specifically, they sequentially learn many-body interactions among input tokens, reaching a saturation point in the prediction error for low-degree interactions while continuing to learn high-degree interactions. To conduct this analysis, we develop a procedure to generate extit{clones} of a given natural language data set, which rigorously capture the interactions between tokens up to a specified order. This approach opens up the possibilities of studying how interactions of different orders in the data affect learning, in natural language processing and beyond.
Problem

Research questions and friction points this paper is trying to address.

Examines simplicity bias in transformers' learning dynamics.
Analyzes sequential learning of token interactions.
Develops method to study data interaction effects.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformers exhibit simplicity bias
Developed dataset clones for analysis
Studied token interaction orders
๐Ÿ”Ž Similar Papers
No similar papers found.
R
Riccardo Rende
International School for Advanced Studies, Trieste, Italy
F
Federica Gerace
International School for Advanced Studies, Trieste, Italy
A
A. Laio
International School for Advanced Studies, Trieste, Italy
Sebastian Goldt
Sebastian Goldt
International School of Advanced Studies (SISSA), Trieste, Italy
Theory of Neural NetworksMachine LearningComputational NeuroscienceStochastic Thermodynamics