🤖 AI Summary
Existing PDE foundation models predominantly rely on deterministic Transformers, which limits generative flexibility and causes severe error accumulation, particularly in long-horizon forecasting. Method: We propose Flow Marching (FM), the first generative PDE foundation model integrating neural operators with flow matching for uncertainty-aware physical sequence modeling. Specifically: (i) we jointly sample the noise level and the physical time step between adjacent states to learn a unified velocity field that mitigates rollout drift; (ii) we design a Physics-Pretrained Variational Autoencoder (P2VAE) and an efficient Flow Marching Transformer (FMT) to enable large-scale pretraining and multi-scale inference; (iii) we combine a diffusion-forcing scheme with latent temporal pyramids to keep training and inference efficient. Results: Trained on ~2.5 million trajectories across 12 PDE families, our model achieves stable few-shot long-range prediction, uncertainty-aware ensemble forecasting, and up to 15× greater computational efficiency than full-length video diffusion models.
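To make the training signal in (i) concrete, the following is a minimal sketch of a Flow Marching objective in PyTorch. The exact noising scheme, conditioning interface, and tensor shapes are not specified in the summary or abstract, so the velocity network `v_theta`, the unit-variance Gaussian perturbation of the current state, and the linear interpolation path are all assumptions chosen to illustrate the idea of jointly sampling a noise level and a physical time step.

```python
import torch
import torch.nn as nn

def flow_marching_loss(v_theta: nn.Module,
                       u_t: torch.Tensor,
                       u_next: torch.Tensor,
                       dt: torch.Tensor) -> torch.Tensor:
    """One Flow Marching training step on a batch of adjacent states (sketch).

    u_t    -- current (latent) physical state, shape (B, C, H, W)
    u_next -- clean successor state dt ahead in physical time, same shape
    dt     -- physical time step between the two states, shape (B,)
    """
    B = u_t.shape[0]
    # Jointly sample a flow-matching noise level tau in (0, 1) per sample;
    # the physical step dt is likewise sampled per sample by the data loader.
    tau = torch.rand(B, device=u_t.device)
    tau_b = tau.view(B, 1, 1, 1)

    # Endpoints of the probability path: a noisy copy of the current state
    # and its clean successor (unit-variance Gaussian noise is an assumption).
    x0 = u_t + torch.randn_like(u_t)
    x1 = u_next

    # Linear interpolant and its constant target velocity.
    x_tau = (1.0 - tau_b) * x0 + tau_b * x1
    v_target = x1 - x0

    # The unified velocity field is conditioned on both tau and dt.
    v_pred = v_theta(x_tau, tau, dt)
    return ((v_pred - v_target) ** 2).mean()
```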
📝 Abstract
Pretraining on large-scale collections of PDE-governed spatiotemporal trajectories has recently shown promise for building generalizable models of dynamical systems. Yet most existing PDE foundation models rely on deterministic Transformer architectures, which lack generative flexibility for many science and engineering applications. We propose Flow Marching, an algorithm that bridges neural operator learning with flow matching, motivated by an analysis of error accumulation in physical dynamical systems, and we build a generative PDE foundation model on top of it. By jointly sampling the noise level and the physical time step between adjacent states, the model learns a unified velocity field that transports a noisy current state toward its clean successor, reducing long-term rollout drift while enabling uncertainty-aware ensemble generation. Alongside this core algorithm, we introduce a Physics-Pretrained Variational Autoencoder (P2VAE) to embed physical states into a compact latent space, and an efficient Flow Marching Transformer (FMT) that combines a diffusion-forcing scheme with latent temporal pyramids, achieving up to 15× greater computational efficiency than full-length video diffusion models and thereby enabling large-scale pretraining at substantially reduced cost. We curate a corpus of ~2.5M trajectories across 12 distinct PDE families and train suites of P2VAEs and FMTs at multiple scales. On downstream evaluation, we benchmark few-shot adaptation to unseen Kolmogorov turbulence, demonstrate improved long-term rollout stability relative to deterministic counterparts, and present uncertainty-stratified ensemble results, highlighting the importance of generative PDE foundation models for real-world applications.
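The ensemble behavior described above can be illustrated with a short rollout sketch: each physical step integrates the learned velocity field from a freshly noised current state to its predicted successor, and repeating this with independent noise yields an ensemble from which uncertainty can be read off. The solver choice (explicit Euler), the number of members and flow-integration steps, and the function name `march_ensemble` are illustrative assumptions; the actual FMT operates in the P2VAE latent space with latent temporal pyramids, which this sketch omits.

```python
import torch

@torch.no_grad()
def march_ensemble(v_theta, u0, dt, horizon, n_members=8, ode_steps=16):
    """Uncertainty-aware rollout (sketch): integrate the learned velocity field
    from a noisy current state to its predicted successor, then step physical time.

    u0 -- initial (latent) state, shape (B, C, H, W); dt -- scalar physical step.
    Returns an ensemble of trajectories, shape (n_members, B, horizon + 1, C, H, W).
    """
    trajectories = []
    for _ in range(n_members):
        u, traj = u0, [u0]
        for _ in range(horizon):
            # Fresh noise per member and per step provides the ensemble spread.
            x = u + torch.randn_like(u)
            dtau = 1.0 / ode_steps
            for k in range(ode_steps):
                tau = torch.full((u.shape[0],), k * dtau, device=u.device)
                dt_b = torch.full((u.shape[0],), float(dt), device=u.device)
                # Explicit Euler step in flow time tau; any ODE solver works here.
                x = x + dtau * v_theta(x, tau, dt_b)
            u = x  # predicted clean successor becomes the next current state
            traj.append(u)
        trajectories.append(torch.stack(traj, dim=1))
    return torch.stack(trajectories, dim=0)
```

Member-wise statistics of the returned tensor (e.g., the standard deviation across the first dimension) give a simple per-pixel uncertainty estimate at each physical time step.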