🤖 AI Summary
This work investigates the optimal $L^p$-approximation capability of deep ReLU networks under joint constraints on the width $W$ and depth $L$, for functions in the Sobolev spaces $W^{s,q}([0,1]^d)$ and Besov spaces $B^s_{q,r}([0,1]^d)$. The key tool is a novel encoding of sparse vectors by networks of varying width and depth, combined with techniques from approximation theory and function-space embeddings. Under the general Sobolev embedding condition, the authors establish the convergence rate $O((WL)^{-2s/d})$, up to logarithmic factors, which is known to be optimal. The analysis unifies and extends prior bounds derived under either fixed-width or fixed-depth assumptions, yielding the sharpest known characterization of the expressive power of deep ReLU networks in terms of the joint $(W,L)$-scaling.
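To make the joint $(W,L)$-scaling concrete, here is a small arithmetic illustration; the exponent $2s/d$ comes from the rate stated above, while the specific values $s=1$, $d=2$ are chosen here purely for illustration:

```latex
% Illustration of the joint (W, L)-scaling of the error bound (WL)^{-2s/d}:
% doubling both the width W and the depth L multiplies the product WL by 4,
% so the bound shrinks by the factor 4^{-2s/d}.
\frac{(2W \cdot 2L)^{-2s/d}}{(WL)^{-2s/d}} = 4^{-2s/d},
\qquad \text{e.g. } s = 1,\ d = 2 \;\Rightarrow\; 4^{-2s/d} = \tfrac{1}{4}.
```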
📝 Abstract
This paper studies the problem of how efficiently functions in the Sobolev spaces $\mathcal{W}^{s,q}([0,1]^d)$ and Besov spaces $\mathcal{B}^s_{q,r}([0,1]^d)$ can be approximated by deep ReLU neural networks with width $W$ and depth $L$, when the error is measured in the $L^p([0,1]^d)$ norm. This problem has been studied by several recent works, which obtained the approximation rate $\mathcal{O}((WL)^{-2s/d})$ up to logarithmic factors when $p=q=\infty$, and the rate $\mathcal{O}(L^{-2s/d})$ for networks with fixed width when the Sobolev embedding condition $1/q - 1/p < s/d$ holds. We generalize these results by showing that the rate $\mathcal{O}((WL)^{-2s/d})$ indeed holds under the Sobolev embedding condition. It is known that this rate is optimal up to logarithmic factors. The key tool in our proof is a novel encoding of sparse vectors by using deep ReLU neural networks with varied width and depth, which may be of independent interest.
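The main result described in the abstract can be written schematically as follows; note that the network class $\mathcal{NN}(W,L)$ (ReLU networks of width $W$ and depth $L$), the use of $\lesssim$, and the suppression of logarithmic factors are shorthand introduced here for readability, not the paper's exact formulation:

```latex
% Schematic statement of the approximation rate from the abstract (Besov case shown;
% the Sobolev case is analogous). Logarithmic factors are suppressed, and
% \mathcal{NN}(W, L) is shorthand for ReLU networks of width W and depth L.
\inf_{\phi \in \mathcal{NN}(W,L)}
  \|f - \phi\|_{L^p([0,1]^d)}
  \;\lesssim\; \|f\|_{\mathcal{B}^{s}_{q,r}([0,1]^d)} \, (WL)^{-2s/d},
\qquad \text{whenever } \tfrac{1}{q} - \tfrac{1}{p} < \tfrac{s}{d}.
```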