🤖 AI Summary
This work addresses the problem of approximating the heat semigroup on an unknown manifold using a graph transition matrix built from a finite sample. The authors construct a Gaussian kernel graph, iterate its transition matrix, and introduce a right normalization to handle non-uniform sampling. Under low regularity assumptions on the target function, including functions that are only in \(L^\infty\), and for sampling densities that may be non-uniform (assumed \(C^3\) and bounded away from zero), the method achieves, for the first time, uniform approximation of the manifold heat semigroup by graph diffusion. It supports both in-sample and out-of-sample estimation, and its infinity-norm error matches the classical graph-Laplacian pointwise rate \(O(N^{-2/(d+6)})\) (up to logarithmic factors) for diffusion times of order \(O(1)\) and longer. Numerical experiments on simulated data illustrate the theoretical findings.
📝 Abstract
We consider graph diffusion processes constructed from finite i.i.d. samples drawn from an unknown manifold embedded in ambient Euclidean space, where the graph affinity is defined by an ambient Gaussian kernel matrix. We show that the manifold heat semigroup $Q_t = e^{t\Delta}$ can be approximated directly by iterating the graph transition matrix $P$, under only low regularity assumptions on the test function $f$, including the case $f \in L^\infty$. We bound $\| P^n f - Q_t f \|$ in $\infty$-norm, with the operator application to $f$ properly defined, and we recover the classical graph-Laplacian pointwise rate $O(N^{-2/(d+6)})$ up to logarithmic factors, for diffusion times $t$ up to $O(1)$ and longer. The rate holds for in-sample error as well as out-of-sample generalization, where the estimator of $Q_t f$ at a new point is defined via kernel convolution. To handle non-uniform sampling densities on the manifold, we introduce a right-normalization of the graph transition matrix; under the assumption that the sampling density $p$ is $C^3$ and bounded away from zero, the same convergence rates hold. We numerically demonstrate the performance of the proposed estimator on simulated data.
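To make the construction concrete, here is a minimal Python sketch of iterating a Gaussian-kernel graph transition matrix on samples from the unit circle. It is an illustration under stated assumptions, not the authors' implementation: the kernel scaling $\exp(-\|x-y\|^2/(4\epsilon))$, the step count $n = t/\epsilon$, the reading of "right-normalization" as dividing each column of the kernel matrix by its degree before row-normalizing, and all function names (`gaussian_kernel`, `transition_matrix`, `graph_diffuse`) are choices made here for illustration and are not taken from the abstract.

```python
import numpy as np

def gaussian_kernel(X, Y, eps):
    """Ambient Gaussian affinities exp(-|x - y|^2 / (4 eps)) between rows of X and Y.
    The 4*eps scaling is an assumption chosen so that one step mimics heat time eps."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (4.0 * eps))

def transition_matrix(X, eps, right_normalize=False):
    """Row-stochastic graph transition matrix P built from the samples X.
    With right_normalize=True, column j is first divided by the degree of node j
    (a hypothetical reading of the right-normalization, meant to offset the density)."""
    W = gaussian_kernel(X, X, eps)
    if right_normalize:
        W = W / W.sum(axis=0, keepdims=True)
    return W / W.sum(axis=1, keepdims=True)

def graph_diffuse(P, f, n):
    """Apply P to the vector f n times, i.e. compute P^n f without forming P^n."""
    u = np.array(f, dtype=float)
    for _ in range(n):
        u = P @ u
    return u

# Demo: i.i.d. samples on the unit circle in R^2 and a bounded, discontinuous f.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
X = np.column_stack([np.cos(theta), np.sin(theta)])
f = np.sign(np.cos(theta))      # only L^infty regularity

eps = 0.01                      # kernel bandwidth parameter (assumed value)
t = 0.5                         # target diffusion time
n = int(round(t / eps))         # number of graph steps, assuming t = n * eps

P = transition_matrix(X, eps, right_normalize=True)
u = graph_diffuse(P, f, n)      # in-sample estimate of Q_t f at the sample points
```

Because `P` is row-stochastic, the iterates never exceed $\max |f|$ in absolute value and constants are preserved, mirroring two basic properties of the heat semigroup. An out-of-sample value at a new point could be sketched in the same spirit by normalizing one row of `gaussian_kernel` between that point and the samples and applying it to the in-sample iterate; the precise definition is the paper's and is not reproduced here.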