Theory-to-Practice Gap for Neural Networks and Neural Operators

📅 2025-03-23
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the sampling complexity of ReLU neural networks and neural operators when learning mappings from relevant approximation spaces, focusing on the theory-to-practice gap between theoretically optimal convergence rates and those attainable by any learning algorithm from finitely many samples. Within a unified $L^p$ framework, the analysis extends the gap to the infinite-dimensional setting of operator learning, using Bochner integration and approximation-theoretic tools to show that the best-possible sampling convergence rate is capped by a Monte Carlo rate of order $1/p$. The results cover Deep Operator Networks (DeepONet) and integral kernel-based neural operators, including the Fourier Neural Operator (FNO), and improve upon existing upper bounds in the literature. Key contributions: (1) a fine-grained separation of parametric complexity from sampling complexity; (2) identification of an intrinsic Monte Carlo-type convergence bottleneck in infinite-dimensional operator learning; and (3) theoretical limits on achievable rates that can guide principled neural operator design.
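
In schematic form (illustrative notation, not the paper's exact statement): for any learning algorithm $A$ that reconstructs $u$ from $m$ point samples $u(x_1), \dots, u(x_m)$, the worst-case error over the relevant approximation class $\mathcal{U}$ satisfies

$$\inf_{A}\ \sup_{u \in \mathcal{U}}\ \big\| u - A\big(u(x_1), \dots, u(x_m)\big) \big\|_{L^p} \;\gtrsim\; m^{-1/p},$$

so the error cannot decay faster than the Monte Carlo rate $m^{-1/p}$, no matter how expressive the network is. In the operator-learning extension, the samples are input-output function pairs and the norm is a Bochner $L^p$-norm; constants, logarithmic factors, and the precise classes are as specified in the paper.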

📝 Abstract
This work studies the sampling complexity of learning with ReLU neural networks and neural operators. For mappings belonging to relevant approximation spaces, we derive upper bounds on the best-possible convergence rate of any learning algorithm, with respect to the number of samples. In the finite-dimensional case, these bounds imply a gap between the parametric and sampling complexities of learning, known as the *theory-to-practice gap*. In this work, a unified treatment of the theory-to-practice gap is achieved in a general $L^p$-setting, while at the same time improving available bounds in the literature. Furthermore, based on these results the theory-to-practice gap is extended to the infinite-dimensional setting of operator learning. Our results apply to Deep Operator Networks and integral kernel-based neural operators, including the Fourier neural operator. We show that the best-possible convergence rate in a Bochner $L^p$-norm is bounded by Monte-Carlo rates of order $1/p$.
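
As a purely illustrative companion to the abstract, and not code from the paper, the sketch below shows one way to probe the practice side of the gap: train a ReLU network on $m$ samples of a toy target for increasing $m$, measure the test error in an $L^2$ sense, and read the empirical convergence exponent from a log-log fit. The target function, architecture, library, and hyperparameters are arbitrary stand-ins.

```python
# Illustrative sketch: empirically estimate the sampling convergence rate of a
# ReLU network on a toy 1D regression problem. The target function, network
# size, and sample sizes below are arbitrary choices, not the paper's setup.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical smooth target; any function of interest could be used.
    return np.sin(4 * np.pi * x) + 0.5 * x**2

# Fixed test grid for measuring the error (here in the L^2 sense, p = 2).
x_test = np.linspace(0.0, 1.0, 2000)
y_test = target(x_test)

sample_sizes = [50, 100, 200, 400, 800, 1600]
errors = []

for m in sample_sizes:
    x_train = rng.uniform(0.0, 1.0, size=m)
    y_train = target(x_train)
    model = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                         max_iter=5000, random_state=0)
    model.fit(x_train.reshape(-1, 1), y_train)
    y_pred = model.predict(x_test.reshape(-1, 1))
    # Empirical L^2 error on the test grid.
    errors.append(np.sqrt(np.mean((y_pred - y_test) ** 2)))

# Slope of log(error) vs log(m) approximates the empirical convergence rate.
slope, _ = np.polyfit(np.log(sample_sizes), np.log(errors), 1)
print(f"empirical rate: error ~ m^{slope:.2f}")
```

The interesting comparison, in the paper's terms, is between such an empirically observed exponent and the much faster rates that pure approximation theory suggests for the same class of targets.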
Problem

Research questions and friction points this paper is trying to address.

Studies sampling complexity of ReLU networks and neural operators
Derives upper bounds on learning convergence rates
Extends theory-to-practice gap to infinite-dimensional operator learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Improves upon existing upper bounds on best-possible convergence rates
Unifies the theory-to-practice gap in a general $L^p$-setting (contrasted schematically below)
Extends the gap to infinite-dimensional operator learning, covering DeepONet and FNO-style neural operators
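
To make the gap in the points above concrete, the following schematic contrast uses illustrative notation: $\mathcal{N}_n$ denotes a class of ReLU networks with $n$ parameters, $S_m$ a training set of $m$ samples of $u$, and the exponent $\alpha$ is a placeholder for whatever approximation rate the target class admits; none of these symbols are taken from the paper.

$$\underbrace{\sup_{u \in \mathcal{U}}\ \inf_{\Phi \in \mathcal{N}_n} \| u - \Phi \|_{L^p} \;\lesssim\; n^{-\alpha}}_{\text{parametric (approximation) complexity in the number of weights } n} \qquad \text{vs.} \qquad \underbrace{\inf_{A}\ \sup_{u \in \mathcal{U}}\ \| u - A(S_m) \|_{L^p} \;\gtrsim\; m^{-1/p}}_{\text{sampling complexity in the number of samples } m}$$

However large $\alpha$ may be, an algorithm that only sees $m$ samples cannot realize a rate better than $m^{-1/p}$; the paper shows that the same ceiling persists for DeepONet and FNO-type operator learning when the error is measured in a Bochner $L^p$-norm.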