🤖 AI Summary
This work unifies neural network learning and kernel learning theory by bridging the intrinsic connections between infinitely wide neural networks, the Neural Network Gaussian Process (NNGP), and the Neural Tangent Kernel (NTK). Method: We propose the Unified Neural Kernel (UNK), constructed as the inner product of gradient-descent-generated variables, jointly capturing training dynamics and initialization effects. UNK asymptotically unifies NNGP (the Bayesian zeroth-order limit) and NTK (the first-order tangent-space limit): it approximates NTK behavior in finite steps and converges to NNGP in the infinite-step limit. Theoretically, we establish uniform tightness and learning convergence guarantees for UNK, leveraging function-space analysis, random matrix theory, and gradient flow modeling. Results: Empirical evaluation across multiple benchmarks demonstrates that UNK significantly outperforms standalone NNGP or NTK, achieving superior generalization performance and enhanced training stability.
📝 Abstract
Past decades have witnessed a great interest in the distinction and connection between neural network learning and kernel learning. Recent advancements have made theoretical progress in connecting infinite-wide neural networks and Gaussian processes. Two predominant approaches have emerged: the Neural Network Gaussian Process (NNGP) and the Neural Tangent Kernel (NTK). The former, rooted in Bayesian inference, represents a zero-order kernel, while the latter, grounded in the tangent space of gradient descents, is a first-order kernel. In this paper, we present the Unified Neural Kernel (UNK), which {is induced by the inner product of produced variables and characterizes the learning dynamics of neural networks with gradient descents and parameter initialization.} The proposed UNK kernel maintains the limiting properties of both NNGP and NTK, exhibiting behaviors akin to NTK with a finite learning step and converging to NNGP as the learning step approaches infinity. Besides, we also theoretically characterize the uniform tightness and learning convergence of the UNK kernel, providing comprehensive insights into this unified kernel. Experimental results underscore the effectiveness of our proposed method.