Equivariant Neural Tangent Kernels

📅 2024-06-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the lack of a unified theoretical framework for understanding the training dynamics of equivariant neural networks (ENNs) and their relationship to data augmentation. We introduce a general equivariant neural tangent kernel (equivariant NTK) theory, proving that, in the infinite-width limit, the expected prediction trajectory of a wide equivariant convolutional network on arbitrary inputs, including those off the data manifold, is identical to that of standard data-augmented training under the corresponding group action. Methodologically, we combine group representation theory, equivariant convolutions, and NTK analysis to derive explicit equivariant NTK constructions for SO(3) and the planar roto-translation group Cₙ ⋉ ℝ². Experiments on histopathological image classification and quantum mechanical property prediction demonstrate that the equivariant NTK, used as a kernel predictor, significantly outperforms non-equivariant baselines. Finite-width experiments further confirm that these infinite-width predictions hold approximately for finite-width ensembles.

📝 Abstract
Little is known about the training dynamics of equivariant neural networks, in particular how it compares to data augmented training of their non-equivariant counterparts. Recently, neural tangent kernels (NTKs) have emerged as a powerful tool to analytically study the training dynamics of wide neural networks. In this work, we take an important step towards a theoretical understanding of training dynamics of equivariant models by deriving neural tangent kernels for a broad class of equivariant architectures based on group convolutions. As a demonstration of the capabilities of our framework, we show an interesting relationship between data augmentation and group convolutional networks. Specifically, we prove that they share the same expected prediction at all training times and even off-manifold. In this sense, they have the same training dynamics. We demonstrate in numerical experiments that this still holds approximately for finite-width ensembles. By implementing equivariant NTKs for roto-translations in the plane ($G=C_{n}\ltimes\mathbb{R}^{2}$) and 3d rotations ($G=\mathrm{SO}(3)$), we show that equivariant NTKs outperform their non-equivariant counterparts as kernel predictors for histological image classification and quantum mechanical property prediction.
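The link the abstract draws between group-convolutional NTKs and data augmentation can be illustrated with a group-averaged kernel: symmetrizing any base kernel over a finite group yields a kernel predictor that cannot distinguish an input from its transformed copies, mirroring full-orbit augmentation for invariant targets. The sketch below is illustrative only, not the paper's construction: it uses an RBF kernel as a stand-in for an NTK, the group is the cyclic rotation group $C_n$ acting on 2D points, and the function names (`cn_averaged_kernel`, `base_kernel`) are our own.

```python
import numpy as np

def rotate(x, theta):
    # Rotate an array of 2D points (shape [m, 2]) by angle theta.
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return x @ R.T

def base_kernel(X, Y, gamma=1.0):
    # RBF kernel as a stand-in for a (non-equivariant) NTK.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def cn_averaged_kernel(X, Y, n=8, gamma=1.0):
    # Symmetrize the base kernel over the cyclic group C_n:
    #   K_G(x, y) = (1/|G|) * sum_g K(x, g.y)
    # For invariant targets, training a kernel predictor with K_G
    # matches training K on the fully augmented (orbit-expanded) data.
    thetas = 2 * np.pi * np.arange(n) / n
    return sum(base_kernel(X, rotate(Y, t), gamma) for t in thetas) / n
```

Because averaging runs over the whole group, applying any group element to `Y` merely permutes the terms of the sum, so `cn_averaged_kernel(X, Y)` equals `cn_averaged_kernel(X, rotate(Y, 2*np.pi/n))` exactly.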
Problem

Research questions and friction points this paper is trying to address.

Equivariant Neural Networks
Data Augmentation
Specific Data Types
Innovation

Methods, ideas, or system contributions that make the work stand out.

Equivariant Neural Tangent Kernels
Data Augmentation Similarity
Superior Performance in Image Classification and Quantum Mechanics
👥 Authors
Philipp Misof
Pan Kessel (TU Berlin; Machine Learning, Theoretical Physics)
Jan E. Gerken