🤖 AI Summary
This work addresses how to sample training data under a limited budget so as to improve the rollout accuracy of neural surrogate models for partial differential equations (PDEs). The authors propose Gradient-Informed Temporal Sampling (GITS), a strategy that jointly optimizes the local gradients of a pilot model and set-level coverage over the temporal dimension, thereby balancing model specificity with dynamical information. GITS targets two limitations of existing approaches, which either collapse into locally high-information-density regions or preserve diversity without being sufficiently model-specific. In experiments across multiple PDE systems, model backbones, and sampling ratios, data selected by GITS achieves lower rollout error than multiple sampling baselines. Ablation studies further support the necessity and complementarity of the two objectives.
📝 Abstract
Neural simulators are typically trained on uniformly sampled numerical simulation data. But under the same budget, does such data provide the most effective information? A fundamental yet unformalized problem is how to sample training data for neural simulators so as to maximize rollout accuracy. Existing data sampling methods either collapse into locally high-information-density regions, or preserve diversity but remain insufficiently model-specific, and their performance is often no better than that of uniform sampling. To address this, we propose Gradient-Informed Temporal Sampling (GITS), a data sampling method tailored to neural simulators. GITS jointly optimizes pilot-model local gradients and set-level temporal coverage, thereby balancing model specificity and dynamical information. Compared with multiple sampling baselines, the data selected by GITS achieves lower rollout error across multiple PDE systems, model backbones, and sampling ratios. Ablation studies demonstrate the necessity and complementarity of the two optimization objectives in GITS. We also analyze the sampling patterns with which GITS succeeds, as well as the typical PDE systems and model backbones on which it fails.
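The abstract does not state the exact objective, so the following is only an illustrative sketch of how a per-sample gradient signal and a set-level temporal coverage term could be traded off in a greedy selection. Everything here is an assumption made for illustration: the function name `select_timesteps`, the weight `lam`, the normalized nearest-neighbor gap used as the coverage term, and the premise that `grad_scores` holds one precomputed pilot-model gradient magnitude per time index. It should not be read as the authors' algorithm.

```python
def select_timesteps(grad_scores, budget, lam=1.0):
    """Greedily pick `budget` time indices (hypothetical sketch, not GITS itself).

    grad_scores: one non-negative score per time index, assumed to be a
                 precomputed gradient magnitude from a pilot model.
    lam:         assumed weight on the temporal-coverage term.
    """
    n = len(grad_scores)
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and len(grad_scores)")

    # Scale gradient scores to [0, 1] so both terms are comparable.
    top = max(grad_scores)
    norm = [g / top if top > 0 else 0.0 for g in grad_scores]
    span = max(n - 1, 1)  # largest possible distance between two indices

    selected = []
    for _ in range(budget):
        best_t, best_val = None, float("-inf")
        for t in range(n):
            if t in selected:
                continue
            # Coverage gain: normalized distance to the nearest index chosen
            # so far (1.0 while nothing has been selected yet).
            gap = min((abs(t - s) for s in selected), default=span) / span
            val = norm[t] + lam * gap
            if val > best_val:
                best_t, best_val = t, val
        selected.append(best_t)
    return sorted(selected)


# Toy trajectory: gradient scores peak sharply around indices 2 to 4.
scores = [0.1, 0.2, 5.0, 4.9, 4.8, 0.1, 0.1, 0.3, 0.2, 0.1]
gradient_only = select_timesteps(scores, budget=3, lam=0.0)
with_coverage = select_timesteps(scores, budget=3, lam=3.0)
```

With `lam=0.0` the selection collapses onto the high-gradient cluster, which mirrors the failure mode the abstract attributes to purely information-driven sampling; a larger `lam` spreads the selected indices across the trajectory while still keeping the highest-gradient index.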