🤖 AI Summary
This work investigates the spontaneous emergence and detection of interpretable computational structures, such as collision-detection modules, in transformer-like architectures trained to simulate particle physics.
Method: We construct an attention-based physics simulator and analyze it from multiple angles: attention-head decomposition, geometric modeling of the loss landscape, and tracking of component dynamics during training.
Contribution/Results: We establish a link between structural emergence and degenerate geometry in parameter space: emergence follows a power law governed by a degenerate "effective potential." Experiments identify dedicated attention heads that perform particle-collision detection, and their formation coincides with pronounced parameter degeneracy in the loss landscape. Crucially, the early dynamics of network components allow such structures to be detected predictively. Our framework offers a new paradigm and a quantifiable theoretical basis for uncovering algorithmic structures encoded in neural networks.
📝 Abstract
Neural networks often contain identifiable computational structures - components of the network that perform an interpretable algorithm or task - but the mechanisms by which these structures emerge and the best methods for detecting them are not well understood. In this paper we investigate the emergence of computational structure in a transformer-like model trained to simulate the physics of a particle system, where the transformer's attention mechanism transfers information between particles. We show that (a) structures emerge in the attention heads of the transformer which learn to detect particle collisions, (b) the emergence of these structures is associated with degenerate geometry in the loss landscape, and (c) the dynamics of this emergence follow a power law. This suggests that these components are governed by a degenerate "effective potential". These results have implications for the convergence time of computational structure within neural networks and suggest that the emergence of computational structure can be detected by studying the dynamics of network components.
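To make the power-law claim concrete: tracking a scalar summary of a component (here, a hypothetical logged weight norm of one attention head) and fitting a line in log-log space recovers the exponent. This is an illustrative sketch, not the paper's code; the synthetic trajectory, the noise model, and the choice of the Frobenius norm as the tracked statistic are all assumptions for demonstration.

```python
import numpy as np

# Hypothetical setup (not from the paper): suppose s(t), the norm of an
# attention head's weights at training step t, grows as s(t) ~ A * t**alpha
# once the component begins to emerge.
rng = np.random.default_rng(0)
steps = np.arange(1, 1001)
alpha_true = 0.5  # assumed exponent, for illustration only
signal = 2.0 * steps**alpha_true * np.exp(rng.normal(0.0, 0.01, steps.size))

# A power law is linear in log-log coordinates:
#   log s = log A + alpha * log t,
# so a least-squares line fit estimates the exponent alpha.
alpha_hat, logA_hat = np.polyfit(np.log(steps), np.log(signal), 1)
print(f"fitted exponent: {alpha_hat:.3f}")
```

In practice one would fit such curves to the measured dynamics of each candidate component; an early, stable power-law trend is the kind of signal the abstract suggests can predict structure formation.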