AI Summary
This work addresses a critical limitation of machine learning interatomic potentials (MLIPs): they often fail to reproduce the physical smoothness of quantum mechanical potential energy surfaces, which can lead to unphysical behavior in molecular dynamics simulations that conventional energy/force regression metrics fail to detect. To tackle this issue, the authors propose the Bond Smoothness Characterization Test (BSCT), a computationally efficient probe that systematically evaluates potential energy surface smoothness through controlled bond deformations. For the first time, BSCT is integrated as a closed-loop feedback tool to guide MLIP architecture optimization. By incorporating a differentiable k-nearest neighbors algorithm and temperature-controlled attention into a Transformer-based MLIP, and refining the design against BSCT, the resulting model achieves low regression errors while significantly enhancing simulation stability and the reliability of atomic-scale property predictions. BSCT itself requires only a fraction of the cost of traditional molecular dynamics evaluation, which is expensive and primarily probes near-equilibrium states.
Abstract
Machine Learning Interatomic Potentials (MLIPs) sometimes fail to reproduce the physical smoothness of the quantum potential energy surface (PES), leading to erroneous behavior in downstream simulations that standard energy and force regression evaluations can miss. Existing evaluations, such as microcanonical molecular dynamics (MD), are computationally expensive and primarily probe near-equilibrium states. To improve evaluation metrics for MLIPs, we introduce the Bond Smoothness Characterization Test (BSCT). This efficient benchmark probes the PES via controlled bond deformations and detects non-smoothness, including discontinuities, artificial minima, and spurious forces, both near and far from equilibrium. We show that BSCT correlates strongly with MD stability while requiring a fraction of the cost of MD. To demonstrate how BSCT can guide iterative model design, we use an unconstrained Transformer backbone as a testbed, illustrating how refinements such as a new differentiable $k$-nearest neighbors algorithm and temperature-controlled attention reduce artifacts identified by our metric. By optimizing model design systematically based on BSCT, the resulting MLIP simultaneously achieves a low conventional E/F regression error, stable MD simulations, and robust atomistic property predictions. Our results establish BSCT as both a validation metric and an "in-the-loop" model design proxy that alerts MLIP developers to physical challenges that cannot be efficiently evaluated by current MLIP benchmarks.
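
The idea of probing a PES through a controlled bond deformation can be illustrated with a toy one-dimensional scan. The sketch below is not the authors' BSCT implementation. The Morse reference curve, the artificial energy drop at a cutoff, the grid, and the two indicators (largest finite-difference curvature and number of local minima) are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

def morse(r, d_e=4.0, a=2.0, r_e=1.0):
    """Smooth reference bond energy (Morse form); parameters are illustrative."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2

def morse_with_cutoff_drop(r, r_cut=1.8025, drop=0.05):
    """Same curve with an artificial energy drop beyond r_cut, mimicking a
    model that abruptly loses an interaction at a hard neighbor cutoff."""
    return morse(r) - drop * (r > r_cut)

def bond_scan(energy_fn, r_min=0.6, r_max=3.0, n=481):
    """Evaluate the energy on a uniform grid of bond lengths."""
    r = np.linspace(r_min, r_max, n)
    return r, energy_fn(r)

def smoothness_report(r, e):
    """Two simple (assumed) smoothness indicators from a 1D bond scan."""
    h = r[1] - r[0]
    # Second finite difference: spikes sharply at an energy discontinuity.
    curvature = np.abs(np.diff(e, 2)) / h**2
    # Interior local minima: energy strictly below both grid neighbors.
    minima = np.sum((e[1:-1] < e[:-2]) & (e[1:-1] < e[2:]))
    return {"max_curvature": float(curvature.max()), "n_minima": int(minima)}

if __name__ == "__main__":
    for name, fn in [("smooth", morse), ("cutoff drop", morse_with_cutoff_drop)]:
        r, e = bond_scan(fn)
        print(name, smoothness_report(r, e))
```

In this toy setting the discontinuous curve shows both a much larger curvature spike and an extra local minimum just past the cutoff, the kind of artifact a bond-deformation scan is meant to expose. A real test would replace the analytic functions with calls to an MLIP on a deformed molecular geometry.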