🤖 AI Summary
This work addresses the scalability bottlenecks of physics-informed neural networks (PINNs) for high-dimensional, high-order PDEs: the 𝒪(d^k) cost of computing spatial derivatives (for problem dimension d and derivative order k) and the 𝒪(P) memory overhead of backpropagating through P network parameters. Existing zeroth-order (ZO) optimization methods often diverge numerically because stochastic gradient estimation triggers a variance explosion. To overcome these limitations, the authors propose the SDZE framework, which introduces a Common Random Numbers synchronization mechanism that suppresses this variance explosion by locking spatial random seeds across perturbations, and integrates an implicit matrix-free subspace projection that reduces the parameter-exploration variance from 𝒪(P) to 𝒪(r). Together, these yield zeroth-order optimization whose spatial and memory complexities are both independent of dimensionality. Experiments demonstrate that the method trains PINNs with tens of millions of parameters on a single NVIDIA A100 GPU, significantly outperforming state-of-the-art approaches in both speed and memory efficiency.
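A minimal NumPy sketch of the seed-locking idea (the toy randomized loss, seed values, and dimensions are illustrative assumptions, not the paper's implementation). It compares two-point ZO gradient estimates when the spatial random seed is shared across the ±ε perturbations versus drawn independently:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_loss(theta, spatial_seed):
    # Toy stand-in for a randomized PDE-residual loss: the spatial
    # estimator's randomness is fully determined by `spatial_seed`.
    noise = np.random.default_rng(spatial_seed).normal(size=theta.shape)
    return np.sum((theta - 1.0) ** 2) + noise @ theta

def zo_grad(theta, eps, sync_seeds):
    # Two-point zeroth-order estimate: [L(θ+εu) − L(θ−εu)] / (2ε) · u.
    u = rng.standard_normal(theta.shape)
    s_plus = int(rng.integers(1 << 30))
    # CRN synchronization: reuse the same spatial seed for both evaluations.
    s_minus = s_plus if sync_seeds else int(rng.integers(1 << 30))
    diff = (stochastic_loss(theta + eps * u, s_plus)
            - stochastic_loss(theta - eps * u, s_minus))
    return diff / (2 * eps) * u

theta, eps = np.full(8, 0.5), 1e-3
synced = np.stack([zo_grad(theta, eps, True) for _ in range(500)])
naive = np.stack([zo_grad(theta, eps, False) for _ in range(500)])
# With a shared seed the noise terms cancel algebraically before the
# division by 2ε; with independent seeds the residual noise is amplified
# by 1/ε, producing the O(1/ε²) variance blow-up described above.
print(synced.var(), naive.var())
```

Running this shows the unsynchronized estimator's variance exceeding the synchronized one by several orders of magnitude, which is the divergence mechanism the summary refers to.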
📝 Abstract
Physics-Informed Neural Networks (PINNs) for high-dimensional and high-order partial differential equations (PDEs) are primarily constrained by the $\mathcal{O}(d^k)$ spatial derivative complexity and the $\mathcal{O}(P)$ memory overhead of backpropagation (BP). While randomized spatial estimators successfully reduce the spatial complexity to $\mathcal{O}(1)$, their reliance on first-order optimization still leads to prohibitive memory consumption at scale. Zeroth-order (ZO) optimization offers a BP-free alternative; however, naively combining randomized spatial operators with ZO perturbations triggers a variance explosion of $\mathcal{O}(1/\varepsilon^2)$, leading to numerical divergence. To address these challenges, we propose the \textbf{S}tochastic \textbf{D}imension-free \textbf{Z}eroth-order \textbf{E}stimator (\textbf{SDZE}), a unified framework that achieves dimension-independent complexity in both space and memory. Specifically, SDZE leverages \emph{Common Random Numbers Synchronization (CRNS)} to algebraically cancel the $\mathcal{O}(1/\varepsilon^2)$ variance by locking spatial random seeds across perturbations. Furthermore, an \emph{implicit matrix-free subspace projection} is introduced to reduce parameter exploration variance from $\mathcal{O}(P)$ to $\mathcal{O}(r)$ while maintaining an $\mathcal{O}(1)$ optimizer memory footprint. Empirical results demonstrate that SDZE enables the training of 10-million-parameter PINNs on a single NVIDIA A100 GPU, delivering significant improvements in speed and memory efficiency over state-of-the-art baselines.
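The implicit matrix-free subspace projection can likewise be sketched in NumPy (a toy quadratic loss and a chunked, seed-regenerated Gaussian projection are illustrative assumptions, not the paper's exact scheme). The perturbation direction lives in an $r$-dimensional subspace, and the $P \times r$ projection matrix is regenerated from a seed block by block rather than stored:

```python
import numpy as np

P, r, eps = 10_000, 16, 1e-2  # parameter count, subspace rank, ZO step

def loss(theta):
    # Toy stand-in for the PINN loss.
    return 0.5 * np.sum(theta ** 2)

def subspace_direction(z, seed, chunk=1024):
    # Materialize u = A z block by block, where A is an implicit P x r
    # Gaussian matrix regenerated from `seed` and never stored:
    # peak extra memory is O(chunk * r), independent of P.
    gen = np.random.default_rng(seed)
    u = np.empty(P)
    for start in range(0, P, chunk):
        rows = min(chunk, P - start)
        u[start:start + rows] = gen.standard_normal((rows, r)) @ z
    return u / np.sqrt(r)

theta = np.ones(P)
z = np.random.default_rng(1).standard_normal(r)  # explore only r directions
u = subspace_direction(z, seed=7)
coeff = (loss(theta + eps * u) - loss(theta - eps * u)) / (2 * eps)
grad_est = coeff * u  # rank-one estimate; exploration variance scales with r, not P
```

For this quadratic loss the two-point difference recovers the directional derivative $\theta^\top u$ exactly, and because the perturbation is confined to $r$ directions, the estimator's variance scales with $r$ rather than $P$, matching the $\mathcal{O}(P) \to \mathcal{O}(r)$ reduction claimed above.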