🤖 AI Summary
To address the poor generalization and out-of-distribution (OOD) transferability of machine-learned interatomic potentials (MLIPs) across broad chemical spaces, this work introduces a bias-aware train-test splitting strategy, the first to explicitly evaluate MLIP performance in chemically unexplored regions (novel compositions, configurations, and topologies). Methodologically, we design an MLIP architecture incorporating long-range electrostatics and van der Waals corrections, coupled with a joint geometry-composition chemical space metric and a rigorous OOD partitioning protocol. Experiments demonstrate that explicit long-range physical modeling is decisive for enhancing transferability; our framework substantially improves prediction accuracy and robustness for complex systems such as metal–organic frameworks. This study establishes a reproducible, verifiable benchmark and design paradigm for universally transferable interatomic potentials.
📝 Abstract
The vastness of chemical space makes generalization a central challenge in the development of machine learning interatomic potentials (MLIPs). While MLIPs could enable large-scale atomistic simulations with near-quantum accuracy, their usefulness is often limited by poor transferability to out-of-distribution samples. Here, we systematically evaluate different MLIP architectures with long-range corrections across diverse chemical spaces and show that such schemes are essential, not only for improving in-distribution performance but, more importantly, for enabling significant gains in transferability to unseen regions of chemical space. To enable more rigorous benchmarking, we introduce biased train-test splitting strategies, which explicitly test model performance in significantly different regions of chemical space. Together, our findings highlight the importance of long-range modeling for achieving generalizable MLIPs and provide a framework for diagnosing systematic failures across chemical space. Although we demonstrate our methodology on metal-organic frameworks, it is broadly applicable to other materials, offering insights into the design of more robust and transferable MLIPs.
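The idea of a biased train-test split can be illustrated with a minimal sketch. This is not the authors' protocol: the function name `biased_split`, the random placeholder descriptors, and the distance-to-centroid criterion are all assumptions chosen for illustration. The sketch holds out the structures that lie farthest from the dataset centroid in some descriptor space, so that the test set occupies a different region than the training set instead of being a random sample.

```python
import numpy as np


def biased_split(features, test_fraction=0.2):
    """Hypothetical biased split: hold out the samples farthest from the centroid.

    `features` is an (n_samples, n_features) array of per-structure descriptors.
    The choice of descriptor and of distance-to-centroid as the bias criterion
    are illustrative assumptions, not the method used in the paper.
    """
    features = np.asarray(features, dtype=float)
    centroid = features.mean(axis=0)
    # Euclidean distance of every sample from the dataset centroid.
    dist = np.linalg.norm(features - centroid, axis=1)
    n_test = max(1, int(round(test_fraction * len(features))))
    order = np.argsort(dist)  # ascending: closest samples first
    train_idx = order[:-n_test]  # in-distribution core
    test_idx = order[-n_test:]   # outermost samples, held out
    return train_idx, test_idx


# Usage with random placeholder descriptors (stand-ins for real structure features).
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 3))
train_idx, test_idx = biased_split(features, test_fraction=0.2)
```

Comparing model errors on such a split against errors on a random split of the same size gives a rough indication of how much accuracy is lost when moving away from the training distribution.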