A Unified Kernel for Neural Network Learning

📅 2024-03-26
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work unifies neural network learning and kernel learning theory by bridging the intrinsic connections between infinitely wide neural networks, the Neural Network Gaussian Process (NNGP), and the Neural Tangent Kernel (NTK). Method: the authors propose the Unified Neural Kernel (UNK), constructed as the inner product of variables generated by gradient descent, jointly capturing training dynamics and initialization effects. UNK asymptotically unifies NNGP (the Bayesian zero-order limit) and NTK (the first-order tangent-space limit): it approximates NTK behavior at finite learning steps and converges to NNGP in the infinite-step limit. Theory: the paper establishes uniform tightness and learning convergence guarantees for UNK, leveraging function-space analysis, random matrix theory, and gradient-flow modeling. Results: empirical evaluation across multiple benchmarks shows that UNK outperforms the standalone NNGP and NTK kernels, with better generalization and more stable training.
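
To make the two limiting objects concrete, here is a minimal NumPy sketch of the standard empirical NNGP and NTK kernels for a one-hidden-layer ReLU network. This is not the paper's UNK construction (which combines both via gradient-descent-generated variables); the architecture, width, and parameterization below are illustrative assumptions.

```python
# Empirical NNGP and NTK for a toy one-hidden-layer ReLU network.
# Standard textbook constructions, NOT the paper's UNK kernel.
import numpy as np

rng = np.random.default_rng(0)

def init_params(d_in, width):
    """NTK-style parameterization: N(0, 1) weights, scaled by 1/sqrt(fan_in) in forward."""
    return {"W1": rng.standard_normal((width, d_in)),
            "w2": rng.standard_normal(width)}

def forward(p, x, width):
    h = np.maximum(p["W1"] @ x / np.sqrt(len(x)), 0.0)  # ReLU hidden layer
    return p["w2"] @ h / np.sqrt(width)                  # scalar output

def grad(p, x, width):
    """Gradient of the scalar output w.r.t. all parameters, flattened."""
    z = p["W1"] @ x / np.sqrt(len(x))
    dw2 = np.maximum(z, 0.0) / np.sqrt(width)
    dW1 = np.outer(p["w2"] * (z > 0) / np.sqrt(width), x / np.sqrt(len(x)))
    return np.concatenate([dW1.ravel(), dw2])

d_in, width, n_init = 4, 512, 200
x1, x2 = rng.standard_normal(d_in), rng.standard_normal(d_in)

# Empirical NNGP: E[f(x1) f(x2)] over random initializations (zero-order).
outs = np.array([[forward(p, x1, width), forward(p, x2, width)]
                 for p in (init_params(d_in, width) for _ in range(n_init))])
nngp = outs[:, 0] @ outs[:, 1] / n_init

# Empirical NTK: <grad f(x1), grad f(x2)> at one initialization (first-order).
p0 = init_params(d_in, width)
ntk = grad(p0, x1, width) @ grad(p0, x2, width)

print(f"empirical NNGP(x1, x2) ~ {nngp:.4f}")
print(f"empirical NTK(x1, x2)  ~ {ntk:.4f}")
```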

📝 Abstract
Past decades have witnessed great interest in the distinction and connection between neural network learning and kernel learning. Recent advancements have made theoretical progress in connecting infinitely wide neural networks and Gaussian processes. Two predominant approaches have emerged: the Neural Network Gaussian Process (NNGP) and the Neural Tangent Kernel (NTK). The former, rooted in Bayesian inference, represents a zero-order kernel, while the latter, grounded in the tangent space of gradient descent, is a first-order kernel. In this paper, we present the Unified Neural Kernel (UNK), which is induced by the inner product of the variables produced by gradient descent and characterizes the learning dynamics of neural networks under gradient descent and parameter initialization. The proposed UNK kernel maintains the limiting properties of both NNGP and NTK, exhibiting behaviors akin to NTK at a finite learning step and converging to NNGP as the learning step approaches infinity. We also theoretically characterize the uniform tightness and learning convergence of the UNK kernel, providing comprehensive insights into this unified kernel. Experimental results underscore the effectiveness of the proposed method.
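
For reference, the two kernels named in the abstract have the following textbook forms; these definitions are standard and not taken verbatim from the paper, and only the limiting behavior the abstract states for UNK is sketched in the last display (the exact UNK construction is given in the paper itself):

```latex
% Zero-order kernel: average of outputs over random initializations.
\[
  \Theta^{\mathrm{NNGP}}(x, x')
    = \mathbb{E}_{\theta \sim \mathcal{N}(0, I)}
      \bigl[ f(x; \theta)\, f(x'; \theta) \bigr]
  \qquad \text{(Bayesian, zero-order)}
\]
% First-order kernel: inner product of parameter gradients.
\[
  \Theta^{\mathrm{NTK}}(x, x')
    = \bigl\langle \nabla_{\theta} f(x; \theta),\,
                   \nabla_{\theta} f(x'; \theta) \bigr\rangle
  \qquad \text{(tangent space, first-order)}
\]
% Limiting behavior stated for UNK, with t the gradient-descent step.
\[
  \Theta^{\mathrm{UNK}}_{t}(x, x') \approx \Theta^{\mathrm{NTK}}(x, x')
  \ \text{for finite } t,
  \qquad
  \lim_{t \to \infty} \Theta^{\mathrm{UNK}}_{t}(x, x')
    = \Theta^{\mathrm{NNGP}}(x, x').
\]
```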
Problem

Research questions and friction points this paper is trying to address.

Unifying NNGP and NTK kernels for neural network learning dynamics
Characterizing gradient descent behavior with finite and infinite steps
Providing theoretical convergence guarantees for the unified kernel
Innovation

Methods, ideas, or system contributions that make the work stand out.

UNK kernel unifies NNGP and NTK properties
Characterizes neural network learning dynamics with gradients
Maintains limiting behaviors of both zero-order and first-order kernels (see the toy sketch below)
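
A purely hypothetical toy interpolation makes this limiting behavior tangible: NTK-like at step t = 0 and NNGP-like as t grows. The mixing schedule alpha(t) = exp(-t/tau) below is invented for illustration only and is not the paper's kernel construction.

```python
# Toy stand-in for a step-indexed kernel with UNK-like limits.
# The blend schedule is an assumption, not the paper's method.
import numpy as np

def toy_unified_kernel(k_ntk: float, k_nngp: float, t: int, tau: float = 10.0) -> float:
    """Blend two kernel values with a decay schedule alpha(t) = exp(-t / tau)."""
    alpha = np.exp(-t / tau)  # alpha = 1 at t = 0, alpha -> 0 as t -> infinity
    return alpha * k_ntk + (1.0 - alpha) * k_nngp

for t in (0, 10, 100, 10_000):
    print(t, toy_unified_kernel(k_ntk=2.0, k_nngp=0.5, t=t))
# t = 0 recovers the NTK value; large t approaches the NNGP value.
```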
Shao-Qun Zhang
State Key Laboratory of Novel Software Technology, Nanjing University, China; School of Intelligent Science and Technology, Nanjing University, China
Zong-Yi Chen
State Key Laboratory of Novel Software Technology, Nanjing University, China; School of Intelligent Science and Technology, Nanjing University, China
Yong-Ming Tian
State Key Laboratory of Novel Software Technology, Nanjing University, China; School of Intelligent Science and Technology, Nanjing University, China
Xun Lu
Department of Economics, Chinese University of Hong Kong (CUHK)