A Unified Kernel for Neural Network Learning

📅 2024-03-26

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

career value

231K/year

🤖 AI Summary

This work unifies neural network learning and kernel learning theory by bridging the intrinsic connections between infinitely wide neural networks, the Neural Network Gaussian Process (NNGP), and the Neural Tangent Kernel (NTK). Method: We propose the Unified Neural Kernel (UNK), constructed as the inner product of gradient-descent-generated variables, jointly capturing training dynamics and initialization effects. UNK asymptotically unifies NNGP (the Bayesian zeroth-order limit) and NTK (the first-order tangent-space limit): it approximates NTK behavior in finite steps and converges to NNGP in the infinite-step limit. Theoretically, we establish uniform tightness and learning convergence guarantees for UNK, leveraging function-space analysis, random matrix theory, and gradient flow modeling. Results: Empirical evaluation across multiple benchmarks demonstrates that UNK significantly outperforms standalone NNGP or NTK, achieving superior generalization performance and enhanced training stability.

Technology Category

Application Category

📝 Abstract

Past decades have witnessed a great interest in the distinction and connection between neural network learning and kernel learning. Recent advancements have made theoretical progress in connecting infinite-wide neural networks and Gaussian processes. Two predominant approaches have emerged: the Neural Network Gaussian Process (NNGP) and the Neural Tangent Kernel (NTK). The former, rooted in Bayesian inference, represents a zero-order kernel, while the latter, grounded in the tangent space of gradient descents, is a first-order kernel. In this paper, we present the Unified Neural Kernel (UNK), which {is induced by the inner product of produced variables and characterizes the learning dynamics of neural networks with gradient descents and parameter initialization.} The proposed UNK kernel maintains the limiting properties of both NNGP and NTK, exhibiting behaviors akin to NTK with a finite learning step and converging to NNGP as the learning step approaches infinity. Besides, we also theoretically characterize the uniform tightness and learning convergence of the UNK kernel, providing comprehensive insights into this unified kernel. Experimental results underscore the effectiveness of our proposed method.

Problem

Research questions and friction points this paper is trying to address.

Unifying NNGP and NTK kernels for neural network learning dynamics

Characterizing gradient descent behavior with finite and infinite steps

Providing theoretical convergence guarantees for the unified kernel

Innovation

Methods, ideas, or system contributions that make the work stand out.

UNK kernel unifies NNGP and NTK properties

Characterizes neural network learning dynamics with gradients

Maintains limiting behaviors of both zero and first-order kernels

🔎 Similar Papers

Artificial neural networks on graded vector spaces