🤖 AI Summary
This work investigates how ensembles of neural networks trained with full data augmentation become group equivariant, and asks whether this phenomenon depends on the infinite-width (neural tangent kernel, NTK) limit.
Method: We prove that equivariance emerges in ensembles of neural networks under full augmentation without invoking the NTK limit, and we extend the result to stochastic settings and to general architectures. For general architectures, we give a simple sufficient condition on the relation between the architecture and the group action under which the result holds. Simple numerical experiments support the theoretical findings.
Contribution/Results: The result removes the dependence on the NTK limit that earlier work required, giving a more general theory of emergent equivariance in ensembles. It covers stochastic settings and any architecture that satisfies the sufficient compatibility condition with the group action.
📝 Abstract
Recently, it was proved that group equivariance emerges in ensembles of neural networks as a result of full augmentation in the limit of infinitely wide neural networks (the neural tangent kernel limit). In this paper, we extend this result significantly. We prove that this emergence does not depend on the neural tangent kernel limit at all. We also consider stochastic settings and general architectures. For the latter, we provide a simple sufficient condition on the relation between the architecture and the action of the group for our results to hold. We validate our findings through simple numerical experiments.
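To make the statement concrete, the following is a minimal NumPy sketch of a toy analogue. It is not the paper's experiment or proof. Everything in it is an illustrative assumption: the cyclic group acting on inputs by coordinate shifts, an invariant regression task, a small two-layer tanh network, plain full-batch gradient descent, and a finite ensemble whose initializations are closed under the group action (a stand-in for sampling from a group-invariant initialization distribution). The names `predict`, `train`, and `invariance_gaps` are hypothetical helpers. Under these assumptions the ensemble mean is invariant along the orbit of an unseen test point up to floating-point error, while a single member generally is not.

```python
import numpy as np

def predict(W, v, X):
    """Two-layer network f(x) = v . tanh(W x), evaluated on the rows of X."""
    return np.tanh(X @ W.T) @ v

def train(W, v, X, y, lr=0.02, steps=300):
    """Full-batch gradient descent on the loss (1 / 2N) * sum of squared residuals."""
    W, v = W.copy(), v.copy()
    for _ in range(steps):
        H = np.tanh(X @ W.T)                       # hidden activations, shape (N, m)
        r = H @ v - y                              # residuals, shape (N,)
        grad_v = H.T @ r / len(y)
        grad_W = ((r[:, None] * (1.0 - H**2)) * v).T @ X / len(y)
        W -= lr * grad_W
        v -= lr * grad_v
    return W, v

def invariance_gaps(seed=0, n=4, hidden=8, n_points=6, n_inits=3):
    """Return (ensemble gap, single-member gap) along the orbit of a test point."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(size=(n_points, n))
    y0 = rng.normal(size=n_points)

    # Full augmentation: every cyclic shift of every input, with the same label.
    X = np.concatenate([np.roll(X0, k, axis=1) for k in range(n)])
    y = np.tile(y0, n)

    # Assumption: the ensemble contains each initialization together with all of
    # its group-transformed copies (shifting the columns of W).
    members = []
    for _ in range(n_inits):
        W0 = rng.normal(size=(hidden, n)) / np.sqrt(n)
        v0 = rng.normal(size=hidden) / np.sqrt(hidden)
        for k in range(n):
            members.append(train(np.roll(W0, k, axis=1), v0, X, y))

    # Evaluate every member on the full orbit of one unseen test point.
    x_test = rng.normal(size=(1, n))
    orbit = np.concatenate([np.roll(x_test, g, axis=1) for g in range(n)])
    preds = np.stack([predict(W, v, orbit) for W, v in members])  # (members, n)

    ens = preds.mean(axis=0)
    ens_gap = float(np.max(np.abs(ens - ens[0])))
    single_gap = float(np.max(np.abs(preds[0] - preds[0][0])))
    return ens_gap, single_gap

ens_gap, single_gap = invariance_gaps()
print(f"ensemble gap: {ens_gap:.2e}, single-member gap: {single_gap:.2e}")
```

The group-closed ensemble is a design choice that makes the symmetry exact at finite ensemble size, so the check does not depend on sampling noise. With independently sampled initializations one would only expect the ensemble gap to shrink as the number of members grows.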