🤖 AI Summary
This work addresses the limited quantitative generalization guarantees available for neural operator methods in multiple operator learning, particularly for unseen operator instances, input functions, and evaluation points. Focusing on multi-task and multiple operator settings with hierarchically sampled data, the study establishes an explicit generalization error bound, based on metric entropy, for models built on the Multiple Neural Operator (MNO) architecture. The authors first bound the covering numbers of hypothesis classes formed by linear combinations of products of deep ReLU subnetworks, then combine these complexity bounds with approximation guarantees for MNO to obtain an approximation-estimation trade-off for the expected test error. The bound makes explicit how the sampling budgets at the three levels (operators, inputs, and evaluation points) affect generalization, and it yields an explicit learning rate and sample-complexity characterization in the operator-sampling budget.
📝 Abstract
Multiple operator learning concerns learning operator families $\{G[α]:U\to V\}_{α\in W}$ indexed by an operator descriptor $α$. Training data are collected hierarchically by sampling operator instances $α$, then input functions $u$ per instance, and finally evaluation points $x$ per input, yielding noisy observations of $G[α][u](x)$. While recent work has developed expressive multi-task and multiple operator learning architectures and approximation-theoretic scaling laws, quantitative statistical generalization guarantees remain limited. We provide a covering-number-based generalization analysis for separable models, focusing on the Multiple Neural Operator (MNO) architecture: we first derive explicit metric-entropy bounds for hypothesis classes given by linear combinations of products of deep ReLU subnetworks, and then combine these complexity bounds with approximation guarantees for MNO to obtain an explicit approximation-estimation tradeoff for the expected test error on new (unseen) triples $(α,u,x)$. The resulting bound makes the dependence on the hierarchical sampling budgets $(n_α,n_u,n_x)$ transparent and yields an explicit learning-rate statement in the operator-sampling budget $n_α$, providing a sample-complexity characterization for generalization across operator instances. The structure and architecture can also be viewed as a general-purpose solver or an example of a "small" PDE foundation model, where the triples are one form of multi-modality.
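The hierarchical sampling scheme described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy operator family $G[α][u](x) = α\int_0^x u(t)\,dt$, the two-parameter input functions $u(t) = a\sin(πt) + b$, the sampling ranges, the Gaussian noise level, and the names `toy_operator` and `sample_hierarchical`. Only the three-level nesting (operator instances, then inputs per instance, then evaluation points per input) follows the text.

```python
import math
import random


def toy_operator(alpha, coeffs, x):
    """Hypothetical G[alpha][u](x) = alpha * integral_0^x u(t) dt,
    for u(t) = a*sin(pi*t) + b with coeffs = (a, b). Illustrative only."""
    a, b = coeffs
    # Closed form of the integral: a*(1 - cos(pi*x))/pi + b*x.
    return alpha * (a * (1.0 - math.cos(math.pi * x)) / math.pi + b * x)


def sample_hierarchical(n_alpha, n_u, n_x, noise_std=0.01, seed=0):
    """Draw n_alpha operator descriptors, n_u inputs per descriptor, and
    n_x evaluation points per input, returning noisy observations."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_alpha):
        alpha = rng.uniform(0.5, 2.0)  # operator instance (assumed range)
        for _ in range(n_u):
            coeffs = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))  # input u
            for _ in range(n_x):
                x = rng.uniform(0.0, 1.0)  # evaluation point
                y = toy_operator(alpha, coeffs, x) + rng.gauss(0.0, noise_std)
                data.append((alpha, coeffs, x, y))
    return data


dataset = sample_hierarchical(n_alpha=4, n_u=3, n_x=5)
print(len(dataset))  # n_alpha * n_u * n_x observations
```

In this sketch the total number of observations is the product $n_α n_u n_x$, but only $n_α$ distinct operator instances are ever seen, which is why a learning rate stated in $n_α$ is the relevant quantity for generalization across operators.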