Depth-induced NTK: Bridging Over-parameterized Neural Networks and Deep Neural Kernels

📅 2025-11-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing NTK theory primarily applies to infinite-width networks, neglecting the impact of depth on representation learning. This work systematically incorporates network depth as an explicit variable, proposing the depth-induced Neural Tangent Kernel (D-NTK): networks with skip connections are mapped to Gaussian processes, with kernel convergence guaranteed as depth tends to infinity. Theoretically, we characterize the D-NTK's dynamic stability and spectral properties, proving its training invariance and its ability to suppress feature collapse. By integrating function-space analysis with spectral decomposition, we establish a rigorous theoretical linkage among depth, kernel behavior, and generalization. Empirically, the D-NTK consistently outperforms the standard NTK on image classification and regression tasks, demonstrating enhanced expressive power and generalization performance, thereby bridging the theoretical gap between the infinite-width and finite-depth regimes.

📝 Abstract
While deep learning has achieved remarkable success across a wide range of applications, the theoretical understanding of its representation learning remains limited. Deep neural kernels provide a principled framework for interpreting over-parameterized neural networks by mapping hierarchical feature transformations into kernel spaces, thereby combining the expressive power of deep architectures with the analytical tractability of kernel methods. Recent advances, particularly neural tangent kernels (NTKs) derived from gradient inner products, have established connections between infinitely wide neural networks and nonparametric Bayesian inference. However, the existing NTK paradigm has been predominantly confined to the infinite-width regime, overlooking the representational role of network depth. To address this gap, we propose a depth-induced NTK based on a shortcut-related architecture, which converges to a Gaussian process as the network depth approaches infinity. We theoretically analyze the training invariance and spectral properties of the proposed kernel, showing that it stabilizes the kernel dynamics and mitigates degeneration. Experimental results further underscore the effectiveness of our proposed method. Our findings significantly extend the existing landscape of neural kernel theory and provide an in-depth understanding of deep learning and scaling laws.
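The abstract's "gradient inner products" construction can be made concrete with a short sketch: the empirical NTK entry K(x1, x2) is the inner product of the network's parameter gradients at the two inputs. The tiny one-hidden-layer network below is purely illustrative (its size, activation, and initialization are assumptions, not the paper's model).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar-output network f(x; W, v) = v . tanh(W x)
W = rng.normal(size=(64, 3)) / np.sqrt(3)
v = rng.normal(size=64) / np.sqrt(64)

def param_grad(x):
    """Flattened gradient of f(x) with respect to all parameters (W, v)."""
    h = np.tanh(W @ x)
    dv = h                             # df/dv_i = tanh(W_i . x)
    dW = np.outer(v * (1 - h**2), x)   # df/dW_ij = v_i sech^2(W_i . x) x_j
    return np.concatenate([dW.ravel(), dv])

def empirical_ntk(x1, x2):
    # K(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>
    return param_grad(x1) @ param_grad(x2)

xs = [rng.normal(size=3) for _ in range(3)]
K = np.array([[empirical_ntk(a, b) for b in xs] for a in xs])
```

Because each entry is an inner product of gradient vectors, the resulting Gram matrix is symmetric and positive semi-definite by construction, which is what makes the kernel interpretation of training dynamics possible.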
Problem

Research questions and friction points this paper is trying to address.

Bridging over-parameterized neural networks with deep neural kernels theoretically
Extending NTK beyond infinite-width to incorporate network depth effects
Analyzing training invariance and spectral properties to stabilize kernel dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Depth-induced NTK kernel with shortcut architecture
Converges to Gaussian process at infinite depth
Stabilizes kernel dynamics and mitigates degeneration
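The depth recursion behind the bullets above can be sketched with a standard NNGP-style kernel map: each layer applies an arc-cosine (ReLU) kernel step, and the shortcut adds the incoming kernel back in before rescaling. The ReLU kernel map and the beta-weighted skip update below are illustrative assumptions, not the paper's exact D-NTK construction.

```python
import numpy as np

def relu_kernel_map(K):
    """Arc-cosine kernel of an infinitely wide ReLU layer
    (weight variance 2, so the diagonal of K is preserved exactly)."""
    d = np.sqrt(np.outer(np.diag(K), np.diag(K)))
    cos = np.clip(K / d, -1.0, 1.0)
    theta = np.arccos(cos)
    return d * (np.sin(theta) + (np.pi - theta) * cos) / np.pi

def residual_kernel(X, depth, beta=0.1):
    """Illustrative depth recursion with skip connections:
    K_{l+1} = (K_l + beta^2 * Phi(K_l)) / (1 + beta^2),
    rescaled so the kernel stays bounded as depth grows."""
    K = X @ X.T / X.shape[1]
    for _ in range(depth):
        K = (K + beta**2 * relu_kernel_map(K)) / (1 + beta**2)
    return K

X = np.random.default_rng(1).normal(size=(4, 8))
K = residual_kernel(X, depth=100)
```

The rescaling step is the key design choice: without it, the skip connection makes the kernel's scale grow with depth, whereas dividing by 1 + beta^2 keeps the diagonal fixed, so the recursion can converge rather than degenerate as depth increases.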
Yong-Ming Tian
State Key Laboratory of Novel Software Technology, Nanjing University, China
Shuang Liang
State Key Laboratory of Novel Software Technology, Nanjing University, China
Shao-Qun Zhang
State Key Laboratory of Novel Software Technology, Nanjing University, China
Feng-Lei Fan
Assistant Professor, City University of Hong Kong
NeuroAI · Data Science · Medical Imaging · Applied Math