Random Quadratic Form on a Sphere: Synchronization by Common Noise

📅 2026-03-06
🤖 AI Summary
This study investigates why tokens cluster in deep Transformers even without self-attention, and shows that common noise acting through the linear layers is enough to synchronize tokens. The authors propose the Random Quadratic Form (RQF) model, a stochastic differential equation that formally corresponds to the gradient flow of a random quadratic functional on the sphere. Using invariant measures and the theory of random attractors, they characterize the synchronization of the two-point motion both in distribution and path-wise. The result is an alternative explanation of token clustering, independent of self-attention, that highlights the role of linear layers in the internal dynamics of Transformers.

📝 Abstract
We introduce the Random Quadratic Form (RQF): a stochastic differential equation which formally corresponds to the gradient flow of a random quadratic functional on a sphere. While the one-point dynamics of the system is a Brownian motion and thus has no preferred direction, the two-point motion exhibits nontrivial synchronizing behaviour. In this work we study synchronization of the RQF: we give both distributional and path-wise characterizations of the solutions by studying invariant measures and random attractors of the system. The RQF model is motivated by the study of the role of linear layers in transformers and illustrates the phenomenon of synchronization by common noise arising in simplified models of transformers. In particular, we provide an alternative explanation of the clustering behaviour in deep transformers, independent of self-attention, and show that tokens cluster even in the absence of the self-attention mechanism.
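The synchronization described in the abstract can be illustrated numerically. The sketch below is not the authors' construction: the abstract does not state the equation, so it assumes the form dx = P_x ∘ dW x, with P_x = I - x xᵀ the projection onto the tangent space of the sphere and W a symmetric matrix-valued Brownian motion with isotropic (GOE-like) increments. It discretizes this with a projected Euler step followed by renormalization. The function names, the noise scaling, the step size, and the dimension are all hypothetical choices made for illustration. Two points are driven by the same noise increments, and the absolute overlap |⟨x, y⟩| is tracked, because the assumed dynamics is odd in x and the points can therefore only align up to sign.

```python
import numpy as np


def _step(x, s):
    """One projected Euler step on the unit sphere (assumed discretization)."""
    v = s @ x                 # gradient direction of the quadratic form <x, s x> / 2
    v = v - (x @ v) * x       # project onto the tangent space at x
    x_new = x + v
    return x_new / np.linalg.norm(x_new)  # renormalize back onto the sphere


def simulate_rqf_pair(d=3, steps=5000, dt=0.01, seed=0):
    """Drive two points on S^{d-1} with the SAME symmetric noise increments.

    Returns the final points and the history of |<x, y>|.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    y = rng.standard_normal(d)
    y /= np.linalg.norm(y)

    overlaps = np.empty(steps + 1)
    overlaps[0] = abs(x @ y)
    for k in range(steps):
        g = rng.standard_normal((d, d))
        # Symmetric Gaussian increment; off-diagonal entries have variance dt
        # (an assumed normalization, not taken from the paper).
        s = np.sqrt(dt / 2.0) * (g + g.T)
        x = _step(x, s)       # common noise: the same s is applied to both points
        y = _step(y, s)
        overlaps[k + 1] = abs(x @ y)
    return x, y, overlaps


if __name__ == "__main__":
    x, y, overlaps = simulate_rqf_pair()
    print("initial |<x, y>|:", overlaps[0])
    print("final   |<x, y>|:", overlaps[-1])
```

Under these assumptions each point on its own moves diffusively over the sphere, while the overlap between the two points is expected to approach 1, which is the two-point synchronization the abstract refers to.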
Problem

Research questions and friction points this paper is trying to address.

synchronization
common noise
transformers
clustering
random quadratic form
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Quadratic Form
synchronization by common noise
random attractors
invariant measures
transformer clustering

Maximilian Engel
University of Amsterdam, FU Berlin
Stochastic and Multiscale Dynamical Systems
Anna Shalova
Korteweg-de Vries Institute for Mathematics, University of Amsterdam