Ensembles provably learn equivariance through data augmentation

📅 2024-10-02

🏛️ arXiv.org

📈 Citations: 5

✨ Influential: 0

🤖 AI Summary

This work investigates how ensembles of neural networks develop group equivariance under full data augmentation, and asks whether the phenomenon depends on the infinite-width assumption, that is, the neural tangent kernel (NTK) regime. Method: We prove that equivariance emerges in the ensemble for finite-width networks, generic architectures, and stochastic training, without invoking the NTK limit. The analysis integrates functional analysis, group representation theory, and stochastic optimization, and yields a simple sufficient condition relating the architecture to the group action; this covers CNNs, MLPs, and other architectures within one framework. Simple numerical experiments validate the theoretical predictions. Contribution/Results: The result removes the dependence on the NTK framework and gives a more general theory of emergent equivariance: an architecture-agnostic explanation of how augmentation produces invariance and equivariance in ensembles during training.

📝 Abstract

Recently, it was proved that group equivariance emerges in ensembles of neural networks as the result of full augmentation in the limit of infinitely wide neural networks (neural tangent kernel limit). In this paper, we extend this result significantly. We provide a proof that this emergence does not depend on the neural tangent kernel limit at all. We also consider stochastic settings, and furthermore general architectures. For the latter, we provide a simple sufficient condition on the relation between the architecture and the action of the group for our results to hold. We validate our findings through simple numeric experiments.
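The mechanism described in the abstract can be illustrated with a toy sketch. This is not the paper's experiment; it rests on several assumptions made here for illustration: a linear model `f(x) = w·x` on R^4, the cyclic group C4 acting by coordinate shifts (with trivial action on the output, so equivariance reduces to invariance), full-batch gradient descent, and a finite ensemble whose initial weights are closed under the group action, used as a stand-in for a group-invariant initialization distribution. All function names and data values are hypothetical.

```python
import random

def shift(v, k):
    """Cyclically shift the list v by k positions (the assumed C4 action on R^4)."""
    k %= len(v)
    return v[-k:] + v[:-k]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def augment(points, targets):
    """Full augmentation: every group element applied to every sample, label unchanged."""
    xs, ys = [], []
    for x, y in zip(points, targets):
        for k in range(len(x)):
            xs.append(shift(x, k))
            ys.append(y)
    return xs, ys

def train(w, xs, ys, lr=0.1, steps=20):
    """Plain full-batch gradient descent on the mean squared error of f(x) = w.x."""
    w = list(w)
    n = len(xs)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            r = dot(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * r * xi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

rng = random.Random(0)
xs, ys = augment([[1.0, 0.0, 0.0, 0.0], [1.0, 2.0, 0.0, 0.0]], [1.0, -1.0])

# Orbit-closed ensemble of initializations: each random draw enters together
# with all of its shifts, so the empirical init distribution is C4-invariant.
inits = []
for _ in range(5):
    w0 = [rng.gauss(0.0, 1.0) for _ in range(4)]
    inits.extend(shift(w0, k) for k in range(4))

members = [train(w0, xs, ys) for w0 in inits]

def ensemble_mean(x):
    return sum(dot(w, x) for w in members) / len(members)

x_test = [0.5, -1.0, 2.0, 0.0]
# Largest change of the ensemble mean when the input is shifted.
ensemble_gap = max(
    abs(ensemble_mean(x_test) - ensemble_mean(shift(x_test, k))) for k in range(4)
)
# Largest change of any single member's output under a one-step shift.
single_gap = max(abs(dot(w, x_test) - dot(w, shift(x_test, 1))) for w in members)
print("ensemble gap:", ensemble_gap)
print("largest single-member gap:", single_gap)
```

In this sketch the augmented loss is unchanged when the weights are shifted, so training a shifted initialization yields the shifted trained weights. The trained ensemble therefore stays closed under the group, and its mean prediction is invariant up to floating-point error even though individual members, stopped after a few steps, are not.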

Problem

Research questions and friction points this paper is trying to address.

Proving that equivariance emerges without the neural tangent kernel limit

Extending results to stochastic settings and general architectures

Validating findings through simple numeric experiments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensembles learn equivariance via data augmentation

Proof independent of neural tangent kernel limit

General architectures with simple sufficient condition
