Function graph transformers universally approximate operators between function spaces

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses how Transformer architectures can stably approximate nonlinear operators between function spaces when the discretization and the output domain vary. It proposes function graph transformers, a graph-preserving subclass of measure-theoretic transformers: input functions are lifted to measures supported on their graphs, so that operator learning is formulated in a measure-theoretic framework. The graph-preserving structure guarantees that outputs remain single-valued functions, while discretization refinement is modeled as convergence of measures and query points may lie on a different output domain. Unlike existing theoretical approaches to transformer-based operator learning, the framework also accommodates regularized negative-order Sobolev inputs and clarifies the role of positional encodings. Theoretically, the paper proves that finite compositions of standard softmax self-attention layers and pointwise MLPs universally approximate broad classes of nonlinear operators, consistently across discretizations.

📝 Abstract

We study the approximation of nonlinear operators between function spaces by transformers. Our approach is to lift functions to measures supported on their graphs and leverage a recently introduced measure-theoretic view of transformers. A function $h$ is represented by its graph measure $\gamma_h$, with finite tokens $\{(x_j,h(x_j))\}_{j=1}^N$ being its empirical approximations. We show that this framework elegantly models discretization refinement via convergence of measures and provides a natural setting for operator learning. Within this framework, we introduce function graph transformers, a graph-preserving subclass of measure-theoretic transformers that maps graph measures to graph measures, which is to say that outputs remain single-valued functions. Crucially, this additional structure does not reduce generality: we prove that the resulting graph-preserving maps can be approximated by finite compositions of standard softmax self-attention layers and pointwise MLPs, yielding universal approximation results for broad classes of nonlinear operators. Unlike existing theoretical approaches to operator learning with transformers, the measure-theoretic framework also accommodates regularized negative-order Sobolev inputs for which discretization invariance is particularly challenging, as well as query points on different output domains. Overall, function graph transformers provide a continuum viewpoint and mathematical toolkit for transformer-based operator learning, clarifying the roles of positional encodings, graph structure, and regularization, and ensuring consistency across discretizations.
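The token picture in the abstract can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not the paper's construction: the domain is taken to be $[0,1]$ with a uniform grid, the empirical graph measure is assumed to be $\frac{1}{N}\sum_{j=1}^N \delta_{(x_j,h(x_j))}$, and the names `graph_tokens`, `attention_at_query`, and `operator_output`, as well as the identity weight matrices, are hypothetical. Because softmax attention normalizes over tokens, its output at a query is a ratio of two averages against the empirical measure, so refining the grid should change the output only slightly.

```python
import numpy as np


def graph_tokens(h, n):
    """Sample n tokens (x_j, h(x_j)) on a uniform grid of [0, 1] (assumed domain)."""
    x = np.linspace(0.0, 1.0, n)
    return np.stack([x, h(x)], axis=1)  # shape (n, 2)


def attention_at_query(query, tokens, Wq, Wk, Wv):
    """Single-head softmax attention of one query against a set of tokens.

    The softmax weights sum to one over the tokens, so the result is a
    weighted average of the value vectors under the empirical measure.
    """
    scores = (tokens @ Wk.T) @ (Wq @ query)  # one score per token, shape (n,)
    scores = scores - scores.max()           # stabilise the exponentials
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ (tokens @ Wv.T)         # shape (value_dim,)


def operator_output(h, n, x_query=0.5):
    """Evaluate one attention layer at a fixed query point using n tokens."""
    eye = np.eye(2)  # hypothetical weights, chosen only for illustration
    query = np.array([x_query, h(x_query)])
    return attention_at_query(query, graph_tokens(h, n), eye, eye, eye)


if __name__ == "__main__":
    h = lambda x: np.sin(2.0 * np.pi * x)
    reference = operator_output(h, 4096)
    for n in (16, 64, 256, 1024):
        gap = np.max(np.abs(operator_output(h, n) - reference))
        print(f"N={n:5d}  max deviation from N=4096: {gap:.2e}")
```

Running the script prints the deviation from the finest grid for each coarser grid; under these assumptions it should shrink as $N$ grows, which is the discretization consistency the abstract describes in terms of convergence of measures.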

Problem

Research questions and friction points this paper is trying to address.

operator learning

function spaces

transformers

discretization invariance

nonlinear operators

Innovation

Methods, ideas, or system contributions that make the work stand out.

function graph transformers

measure-theoretic transformers

operator learning