🤖 AI Summary
This work addresses the largely pessimistic state of theory on the statistical accuracy of generative models in scientific computing, focusing on probability measures induced by partial differential equations (PDEs). We establish a theoretical framework showing that, under standard structural assumptions, PDE-induced measures satisfy a doubling condition, which in turn yields Hölder regularity of the optimal transport map from a uniform source measure to the target measure. Leveraging this regularity, structural assumptions on PDE solutions, and tools from statistical learning theory, we analyze one-step Wasserstein-guided generative models such as DeepParticle. We derive excess-risk bounds on the discrepancy between the learned map and the population-optimal transport map, establish a robustness estimate under target shift, and report experiments that support the derived rates.
📝 Abstract
Despite the remarkable empirical success of generative models, the available theory on their statistical accuracy in scientific computing remains largely pessimistic. This paper develops a theoretical framework for understanding the regularity of transport maps and the generalization properties of one-step Wasserstein-guided generative models for PDE-induced probability measures. We consider normalized target densities associated with linear elliptic and parabolic equations on bounded domains, as well as diffusion and Fokker-Planck equations on the torus. Under standard structural assumptions, we prove that these target measures satisfy doubling conditions. By combining this fact with regularity theory for optimal transport between doubling measures, we show that the optimal transport map from a uniform source measure to the target measure is Hölder continuous. This regularity yields an approximation-theoretic justification for one-step generative models that learn PDE-induced distributions via a single pushforward map. As a representative instance, we study DeepParticle and derive excess-risk bounds characterizing the discrepancy between the learned map and the population-optimal map. We also establish a robustness estimate under target shift and illustrate the theory with experiments that support the derived rates.
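To make the idea of a single pushforward map concrete, the following is a minimal one-dimensional sketch and not the paper's method or the DeepParticle implementation. It assumes a toy target density proportional to sin(πx) on [0, 1], chosen here only because it resembles a normalized positive solution of an elliptic problem on a bounded domain. In one dimension the optimal transport map from the uniform measure is the target's quantile function, and the empirical 2-Wasserstein distance is obtained by matching sorted samples. All function names (`target_quantile`, `sample_target`, `empirical_w2`, `pushforward`) are hypothetical.

```python
import math
import random


def target_quantile(u):
    """Quantile function of the toy density (pi/2) * sin(pi * x) on [0, 1].

    Its CDF is F(x) = (1 - cos(pi * x)) / 2, so F^{-1}(u) = acos(1 - 2u) / pi.
    In 1D this monotone map is the optimal transport map from Uniform(0, 1).
    """
    return math.acos(1.0 - 2.0 * u) / math.pi


def sample_target(n, rng):
    """Draw n independent target samples by rejection sampling (assumed toy setup)."""
    samples = []
    while len(samples) < n:
        x = rng.random()
        # Accept x with probability sin(pi * x), which is proportional to the density.
        if rng.random() < math.sin(math.pi * x):
            samples.append(x)
    return samples


def empirical_w2(xs, ys):
    """Empirical 2-Wasserstein distance between two equal-size 1D samples.

    In 1D the optimal coupling pairs the sorted samples in order.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(xs))


def pushforward(transport_map, source_samples):
    """Apply a one-step map to every source sample."""
    return [transport_map(u) for u in source_samples]


rng = random.Random(0)
n = 2000
source = [rng.random() for _ in range(n)]   # uniform source measure
target = sample_target(n, rng)              # independent reference samples

w2_good = empirical_w2(pushforward(target_quantile, source), target)
w2_bad = empirical_w2(pushforward(lambda u: u, source), target)  # identity map baseline

print(f"W2 with quantile map: {w2_good:.4f}")
print(f"W2 with identity map: {w2_bad:.4f}")
```

In this toy setting the quantile map should give a much smaller Wasserstein distance to the reference samples than the identity map. A learned one-step model replaces `target_quantile` with a parametrized map fitted by minimizing such a distance.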