🤖 AI Summary
Mantle convection simulations face computational challenges: poorly constrained input parameters, highly nonlinear viscosity, and integration over billion-year timescales. To address these, we propose a physics-constrained machine learning framework. Our method uses a convolutional neural network (CNN) architecture that enforces mass conservation by predicting divergence-free velocity fields, augmented by learnable boundary padding and loss scaling. Trained on only 94 high-fidelity simulations, it generalizes across parameters. Instead of solving the Stokes equations numerically, the model directly predicts creeping-flow velocity fields satisfying the continuity equation; temperature evolution is then advanced autoregressively with a finite-volume scheme. Evaluated on 2D multi-heater configurations, the approach delivers accurate long-term rollouts with up to an 89× speedup over a conventional solver, and we probe its transferability to unseen physical configurations.
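The summary does not spell out how divergence-free prediction is enforced. A standard construction, shown here as a hedged sketch (not necessarily the paper's exact mechanism), is to have the network output a scalar stream function ψ and obtain the velocity as u = ∂ψ/∂y, v = −∂ψ/∂x; because the discrete derivative operators along different axes commute, the resulting discrete divergence vanishes to machine precision for *any* ψ:

```python
import numpy as np

# Hypothetical sketch: a divergence-free 2D velocity field obtained by
# differentiating a predicted scalar stream function psi, instead of
# predicting (u, v) directly. psi stands in for a CNN output here.
rng = np.random.default_rng(0)
psi = rng.standard_normal((64, 64))  # stand-in for the network's prediction

dx = dy = 1.0  # grid spacing
u = np.gradient(psi, dy, axis=0)    # u =  d(psi)/dy
v = -np.gradient(psi, dx, axis=1)   # v = -d(psi)/dx

# Discrete divergence du/dx + dv/dy cancels term-by-term, since the
# x- and y-difference operators act on different axes and commute.
div = np.gradient(u, dx, axis=1) + np.gradient(v, dy, axis=0)
print(np.max(np.abs(div)))  # near machine precision
```

This guarantees mass conservation by construction rather than by a soft penalty term in the loss.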
📝 Abstract
Mantle convection simulations are an essential tool for understanding how rocky planets evolve. However, the poorly known input parameters to these simulations, the non-linear dependence of transport properties on pressure and temperature, and the long integration times in excess of several billion years all pose a computational challenge for numerical solvers. We propose a physics-based machine learning approach that predicts creeping flow velocities as a function of temperature while conserving mass, thereby bypassing the numerical solution of the Stokes problem. A finite-volume solver then uses the predicted velocities to advect and diffuse the temperature field to the next time-step, enabling autoregressive rollout at inference. For training, our model requires temperature-velocity snapshots from a small number of simulations (94). We consider mantle convection in a two-dimensional rectangular box with basal and internal heating, and pressure- and temperature-dependent viscosity. Overall, our model is up to 89 times faster than the numerical solver. We also show how different components of our convolutional neural network architecture, namely mass conservation, learned boundary paddings, and loss scaling, affect the overall rollout performance. Finally, we test our approach on unseen scenarios to demonstrate some of its strengths and weaknesses.
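The hybrid loop described above (network predicts velocity, a finite-volume update advects and diffuses temperature, repeat) can be sketched as follows. This is a deliberately simplified illustration, assuming a periodic grid, first-order upwind advection, and a stand-in `predict_velocity` in place of the trained CNN; the paper's actual solver, boundary conditions, and discretization may differ:

```python
import numpy as np

def predict_velocity(T):
    # Stand-in for the trained CNN: a fixed rightward flow.
    return np.full_like(T, 0.5), np.zeros_like(T)

def step(T, dt=0.1, dx=1.0, kappa=0.05):
    """One autoregressive step: upwind advection + explicit diffusion."""
    u, v = predict_velocity(T)
    # Upwind differences (velocities assumed non-negative for brevity).
    adv = (u * (T - np.roll(T, 1, axis=1))
         + v * (T - np.roll(T, 1, axis=0))) / dx
    lap = (np.roll(T, 1, 0) + np.roll(T, -1, 0)
         + np.roll(T, 1, 1) + np.roll(T, -1, 1) - 4 * T) / dx**2
    return T + dt * (kappa * lap - adv)

T = np.zeros((32, 32))
T[12:20, 12:20] = 1.0          # initial hot anomaly
for _ in range(50):            # autoregressive rollout at inference
    T = step(T)
```

Because only the velocity prediction replaces the expensive Stokes solve, the temperature update retains the conservation and stability properties of the numerical scheme.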