AI Summary
This work addresses two problems with machine-learned interatomic potentials: they fail silently on out-of-distribution configurations, and they lack efficient, reliable uncertainty quantification. The authors propose a lightweight approach that turns a deterministic potential into a probabilistic model through learned functional perturbations, fine-tuned end-to-end with the continuous ranked probability score (CRPS). The approach needs no variational inference, no parametric distributional assumptions, and no task-specific hyperparameter tuning. Evaluated on N-body charged-particle and silica systems, the method improves uncertainty quality: on the N-body benchmark, P-EGNN reduces CRPS by 19-32% relative to BLIP, and on silica, P-Orb raises the Spearman correlation between predicted uncertainty and actual error from 0.75 (BLIP-Orb) to 0.84.
Abstract
Machine Learning Interatomic Potentials (MLIPs) achieve near ab initio accuracy at a fraction of the cost of quantum-mechanical simulations, yet they remain prone to silent failures on out-of-distribution configurations, making principled uncertainty quantification (UQ) essential for error-aware simulations and active learning. Existing non-ensemble UQ methods for MLIPs rely either on variational inference or on parametric distributional assumptions, both of which add architectural complexity and hyperparameters that must be tuned per task. Inspired by recent advances in probabilistic weather forecasting, we propose a simpler alternative: turn a deterministic MLIP into a probabilistic one through learned functional perturbations and finetune it end-to-end with the Continuous Ranked Probability Score (CRPS), a proper scoring rule. We validate the approach with an equivariant GNN (P-EGNN) trained from scratch and by finetuning the Orb-v3 foundation model on silica (P-Orb). On the N-body charged particle benchmark, P-EGNN improves CRPS over the state-of-the-art Bayesian MLIP method BLIP by 19-32% across all training sizes; on silica, P-Orb raises the Spearman correlation between predicted uncertainty and actual error from 0.75 (BLIP-Orb) to 0.84.
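To make the training objective concrete, the sketch below computes the standard sample-based CRPS estimate for a scalar target, CRPS ≈ mean|x_i − y| − (1/(2m²)) Σ_i Σ_j |x_i − x_j|, where the x_i are m samples drawn from the probabilistic model and y is the reference value. This is an illustration under assumptions, not the authors' implementation: the function name `crps_ensemble` is hypothetical, the abstract does not say which CRPS estimator the paper uses (a bias-corrected variant divides the pairwise term by 2m(m−1) instead), and how per-component force or energy scores are aggregated is not specified here.

```python
def crps_ensemble(samples, observed):
    """Empirical CRPS of a set of sampled predictions against one scalar
    observation. Lower is better; 0 means every sample equals the observation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one sample")
    # Accuracy term: mean absolute error of the samples.
    accuracy = sum(abs(x - observed) for x in samples) / m
    # Spread term: half the mean absolute difference over all ordered pairs.
    spread = sum(abs(a - b) for a in samples for b in samples) / (2 * m * m)
    return accuracy - spread


if __name__ == "__main__":
    # A single sample reduces CRPS to the absolute error.
    print(crps_ensemble([3.0], 1.0))       # 2.0
    # Two samples that bracket the observation score better than either alone.
    print(crps_ensemble([0.0, 2.0], 1.0))  # 0.5
```

The spread term is what rewards calibrated uncertainty. A model that collapses to a single confident prediction is scored by its absolute error alone, while a model whose samples spread around the true value earns a lower score. Since CRPS is a proper scoring rule, its expected value is minimized by predicting the true distribution.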