Approximation Rates for Shallow ReLU$^k$ Neural Networks on Sobolev Spaces via the Radon Transform

📅 2024-08-20
📈 Citations: 4
Influential: 0
🤖 AI Summary
This work studies the $L_q(\Omega)$-norm approximation of functions in the Sobolev space $W^{s}(L_p(\Omega))$ by shallow ReLU$^k$ neural networks on a bounded domain $\Omega \subset \mathbb{R}^d$. To overcome the smoothness barrier imposed by the piecewise polynomial structure of ReLU$^k$ networks, the analysis integrates Radon transform techniques with discrepancy theory. The main result establishes a nearly optimal approximation rate of $O(n^{-s/d})$, up to logarithmic factors, under the condition $s \leq k + (d+1)/2$. This rate holds in a broad regime, including $q \leq p$ and $p \geq 2$, significantly improving and generalizing prior bounds. The analysis reveals the strong adaptive approximation capability of shallow ReLU$^k$ networks for highly smooth functions: their expressive power extends beyond classical piecewise-polynomial limitations once geometric integral representations are leveraged.

📝 Abstract
Let $\Omega \subset \mathbb{R}^d$ be a bounded domain. We consider the problem of how efficiently shallow neural networks with the ReLU$^k$ activation function can approximate functions from Sobolev spaces $W^s(L_p(\Omega))$ with error measured in the $L_q(\Omega)$-norm. Utilizing the Radon transform and recent results from discrepancy theory, we provide a simple proof of nearly optimal approximation rates in a variety of cases, including when $q \leq p$, $p \geq 2$, and $s \leq k + (d+1)/2$. The rates we derive are optimal up to logarithmic factors, and significantly generalize existing results. An interesting consequence is that the adaptivity of shallow ReLU$^k$ neural networks enables them to obtain optimal approximation rates for smoothness up to order $s = k + (d+1)/2$, even though they represent piecewise polynomials of fixed degree $k$.
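To make the objects in the abstract concrete, the following is a minimal sketch (not the paper's construction) of what a shallow ReLU$^k$ network computes, $f(x) = \sum_{i=1}^n a_i \, \max(0, w_i \cdot x + b_i)^k$, together with the predicted rate exponent $-s/d$ from the theorem. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def relu_k(t, k):
    """ReLU^k activation: max(0, t)^k, a piecewise polynomial of degree k."""
    return np.maximum(t, 0.0) ** k

def shallow_relu_k_net(x, weights, biases, coeffs, k):
    """Evaluate a shallow (one-hidden-layer) ReLU^k network
    f(x) = sum_i coeffs[i] * max(0, weights[i] . x + biases[i])^k
    at m points x of shape (m, d); weights has shape (n, d)."""
    pre = x @ weights.T + biases      # (m, n) pre-activations w_i . x + b_i
    return relu_k(pre, k) @ coeffs    # (m,) network outputs

def rate_exponent(s, d, k):
    """Exponent in the error bound O(n^{-s/d}) (up to log factors),
    valid in the regime s <= k + (d + 1) / 2 from the main result."""
    assert s <= k + (d + 1) / 2, "smoothness s outside the covered regime"
    return -s / d
```

For example, with ReLU$^2$ networks in $d = 2$ the condition allows smoothness up to $s = 2 + 3/2 = 3.5$, and a target in $W^3(L_p)$ yields the exponent `rate_exponent(3, 2, 2) == -1.5`, i.e. error decaying like $n^{-3/2}$ in the number of neurons $n$.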
Problem

Research questions and friction points this paper is trying to address.

Efficiency of shallow ReLU^k networks in Sobolev space approximation
Optimal approximation rates via Radon transform and discrepancy theory
Adaptivity enables optimal rates for smoothness up to s=k+(d+1)/2
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses the Radon transform to derive approximation rates
Combines discrepancy theory with shallow ReLU^k network constructions
Achieves nearly optimal rates, up to logarithmic factors