🤖 AI Summary
This paper addresses the fundamental problem of unclear generalization guarantees in machine learning function approximation—specifically, the lack of theoretical foundations for model performance on unknown manifolds and unseen data. Methodologically, it departs from explicit geometric modeling of manifolds (e.g., via Laplace–Beltrami operators or atlases) and instead integrates classical approximation theory, spectral graph theory, differential geometry, and physics-informed embedding to systematically characterize the approximation mechanisms of diverse paradigms—including deep/shallow neural networks, neural operators, Transformers, and physics-informed neural surrogates. Key contributions include: (i) a precise delineation of expressive capacity boundaries across mainstream architectures; (ii) a robust generalization analysis framework that requires no prior knowledge of manifold geometry; and (iii) the first unifying theoretical perspective for manifold learning and scientific machine learning grounded explicitly in approximation theory.
📝 Abstract
A central problem in machine learning is often formulated as follows: given a dataset $\{(x_j, y_j)\}_{j=1}^M$ sampled from an unknown probability distribution, construct a functional model $f$ such that $f(x) \approx y$ for any $(x, y)$ drawn from the same distribution. Neural networks and kernel-based methods are commonly employed for this task because they admit fast, parallel computation. The approximation capabilities, or expressive power, of these methods have been studied extensively over the past 35 years. In this paper, we present examples of key ideas in this area from the literature, and we discuss emerging trends in machine learning including the role of shallow/deep networks, approximation on manifolds, physics-informed neural surrogates, neural operators, and transformer architectures. Although function approximation is a fundamental problem in machine learning, approximation theory does not play a central role in the theoretical foundations of the field. One unfortunate consequence of this disconnect is that it is often unclear how well trained models will generalize to unseen or unlabeled data. In this review, we examine some of the shortcomings of the current machine learning framework and explore the reasons for the gap between approximation theory and machine learning practice. We then introduce our novel research on function approximation on unknown manifolds without the need to learn specific manifold features, such as the eigen-decomposition of the Laplace–Beltrami operator or atlas construction. In many machine learning problems, particularly classification tasks, the labels $y_j$ are drawn from a finite set of values.
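The setup above can be illustrated with a minimal sketch using one classical kernel-based method, Gaussian-kernel ridge regression. The particular kernel, bandwidth, regularization constant, and target function are illustrative assumptions, not details from the paper; the point is only the abstract's template: fit $f$ on samples $\{(x_j, y_j)\}_{j=1}^M$, then measure $f(x) \approx y$ on fresh draws from the same distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Stands in for the "unknown" relationship between x and y.
    return np.sin(2 * np.pi * x)

# Sample {(x_j, y_j)}, j = 1..M, with observation noise.
M = 50
x_train = rng.uniform(0, 1, M)
y_train = target(x_train) + 0.05 * rng.normal(size=M)

def gaussian_kernel(a, b, width=0.1):
    # Pairwise Gaussian kernel matrix K[i, j] = exp(-|a_i - b_j|^2 / 2w^2).
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * width**2))

# Kernel ridge regression: solve (K + lam*I) c = y for the coefficients
# of the expansion f(x) = sum_j c_j * k(x, x_j).
K = gaussian_kernel(x_train, x_train)
lam = 1e-3
coef = np.linalg.solve(K + lam * np.eye(M), y_train)

def f(x):
    return gaussian_kernel(np.atleast_1d(x), x_train) @ coef

# Generalization check: error on unseen points from the same distribution.
x_test = rng.uniform(0, 1, 200)
rmse = np.sqrt(np.mean((f(x_test) - target(x_test)) ** 2))
print(f"test RMSE: {rmse:.3f}")
```

Note that the small test error here is an empirical observation for this particular sample, not a guarantee; the gap between such empirical checks and provable generalization bounds is exactly the disconnect the review examines.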