🤖 AI Summary
This work bridges the theoretical gap between kernel methods and deep neural networks. To this end, we propose Structured Deep Kernel Networks (SDKNs), a novel model that unifies the theoretical rigor of kernel methods with the representational power of deep networks. The key innovation is a design of learnable activation functions that satisfy a deep kernel representer theorem, coupled with differentiable, hierarchically structured kernels that achieve better asymptotic approximation rates than ReLU networks in the unbounded-depth regime. We theoretically establish universal approximation and optimality of SDKNs across several asymptotic regimes, namely as the number of centers, the width, and the depth grow. The framework thus connects kernel learning and deep learning, combining the approximation guarantees of classical kernel methods with the computational properties of deep architectures. By providing strong theoretical foundations for high-dimensional nonlinear modeling, SDKNs offer an efficient, principled alternative to conventional deep architectures.
📝 Abstract
Kernel-based methods yield approximation models that are flexible, efficient, and powerful. In particular, they utilize fixed feature maps of the data and are often backed by strong analytical results that prove their accuracy. On the other hand, the recent success of machine learning has been driven by deep neural networks (NNs). They achieve significant accuracy on very high-dimensional data, as they are able to learn efficient data representations, i.e., data-based feature maps. In this paper, we leverage a recent deep kernel representer theorem to connect the two approaches and understand their interplay. In particular, we show that the use of special types of kernels yields models reminiscent of neural networks that are founded in the same theoretical framework as classical kernel methods, while enjoying many computational properties of deep neural networks. Especially, the introduced Structured Deep Kernel Networks (SDKNs) can be viewed as neural networks with optimizable activation functions obeying a representer theorem. Analytic results establish their universal approximation properties in different asymptotic regimes of an unbounded number of centers, width, and depth. Especially in the case of unbounded depth, the construction is asymptotically better than corresponding constructions for ReLU neural networks, which is made possible by the flexibility of kernel approximation.
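To make the core idea concrete, the following is a minimal sketch (not the paper's implementation) of a single SDKN-style layer: a linear map followed by a learnable activation given by a kernel expansion sigma(x) = sum_j alpha_j k(x, c_j) over a fixed set of centers, as suggested by a representer theorem. The Gaussian kernel, the center placement, and all names here are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(x, centers, gamma=1.0):
    """Gaussian kernel k(x, c) = exp(-gamma * (x - c)^2), evaluated pairwise.

    x: shape (n,), centers: shape (m,) -> kernel matrix of shape (n, m)."""
    return np.exp(-gamma * (x[:, None] - centers[None, :]) ** 2)

class KernelActivation:
    """Learnable activation sigma(x) = sum_j alpha_j * k(x, c_j).

    The coefficients alpha are the trainable parameters; the centers c_j
    stay fixed, mirroring a representer-theorem expansion."""

    def __init__(self, centers, seed=None):
        rng = np.random.default_rng(seed)
        self.centers = np.asarray(centers, dtype=float)
        self.alpha = rng.normal(size=self.centers.shape)

    def __call__(self, x):
        # Apply the kernel expansion elementwise to the (flattened) input.
        K = gaussian_kernel(np.ravel(x), self.centers)
        return (K @ self.alpha).reshape(np.shape(x))

# Usage: one layer of the form x -> sigma(W x), i.e. a linear map with
# an optimizable kernel-based activation instead of a fixed ReLU.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))                       # linear layer weights
sigma = KernelActivation(np.linspace(-2.0, 2.0, 5), seed=1)
x = rng.normal(size=(2,))
out = sigma(W @ x)
print(out.shape)                                  # one hidden vector of width 3
```

In a full network, several such layers would be stacked and both the linear weights and the expansion coefficients alpha optimized jointly by gradient descent, which is what makes the activations "optimizable" in contrast to a fixed ReLU.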