Generalization of Long-Range Machine Learning Potentials in Complex Chemical Spaces

📅 2025-12-08

🤖 AI Summary

Machine-learned interatomic potentials (MLIPs) often generalize poorly and transfer badly to out-of-distribution (OOD) samples across broad chemical spaces. To address this, the work introduces biased train-test splitting strategies that explicitly evaluate MLIP performance in chemically unexplored regions (novel compositions, configurations, and topologies). Methodologically, it evaluates MLIP architectures that incorporate long-range electrostatics and van der Waals corrections, coupled with a joint geometry-composition chemical space metric and a rigorous OOD partitioning protocol. Experiments indicate that explicit long-range physical modeling is essential for transferability: it improves prediction accuracy and robustness for complex systems such as metal-organic frameworks. The study provides a reproducible benchmark and a design framework for more transferable interatomic potentials.

📝 Abstract

The vastness of chemical space makes generalization a central challenge in the development of machine learning interatomic potentials (MLIPs). While MLIPs could enable large-scale atomistic simulations with near-quantum accuracy, their usefulness is often limited by poor transferability to out-of-distribution samples. Here, we systematically evaluate different MLIP architectures with long-range corrections across diverse chemical spaces and show that such schemes are essential, not only for improving in-distribution performance but, more importantly, for enabling significant gains in transferability to unseen regions of chemical space. To enable more rigorous benchmarking, we introduce biased train-test splitting strategies, which explicitly test model performance in significantly different regions of chemical space. Together, our findings highlight the importance of long-range modeling for achieving generalizable MLIPs and provide a framework for diagnosing systematic failures across chemical space. Although we demonstrate our methodology on metal-organic frameworks, it is broadly applicable to other materials, offering insights into the design of more robust and transferable MLIPs.
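
To make the idea of a biased train-test split concrete, the following is a minimal sketch, not the paper's actual procedure, which this summary does not specify. It assumes each structure is already described by a fixed-length feature vector (the `features` array is hypothetical) and holds out the samples lying furthest along the main axis of variation, so the test set occupies a different region of descriptor space than the training set. A random split is included for contrast.

```python
import numpy as np


def biased_split(features, test_fraction=0.2):
    """Hold out the samples at one extreme of the first principal axis.

    Illustrative assumption: 'region of chemical space' is approximated by
    position along the direction of largest variance in `features`.
    """
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    # Fix the arbitrary sign of the singular vector so the split is deterministic.
    if axis[np.argmax(np.abs(axis))] < 0:
        axis = -axis
    scores = centered @ axis
    order = np.argsort(scores, kind="stable")
    n_test = max(1, int(round(test_fraction * len(order))))
    # Training set: low-score samples; test set: the high-score extreme.
    return order[:-n_test], order[-n_test:]


def random_split(n_samples, test_fraction=0.2, seed=0):
    """Conventional in-distribution split, for comparison."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = max(1, int(round(test_fraction * n_samples)))
    return order[:-n_test], order[-n_test:]
```

Comparing a model's error on the `biased_split` test set with its error on the `random_split` test set gives a rough measure of how much accuracy is lost when moving out of distribution.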

Problem

Research questions and friction points this paper is trying to address.

Evaluating machine learning potentials' generalization across diverse chemical spaces

Addressing poor transferability of interatomic potentials to unseen chemical regions

Developing rigorous benchmarking for systematic failures in chemical space modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Long-range corrections enhance MLIP transferability across chemical spaces

Biased train-test splits benchmark model performance in significantly different regions of chemical space

Methodology applicable to various materials for robust potential design
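
As a rough illustration of what a long-range correction adds to a short-range model, the sketch below sums a pairwise Coulomb term with the output of a short-range energy function. It is a simplified assumption, not the scheme used in the paper: the charges are fixed inputs, the system is a non-periodic cluster, and `short_range_energy` is a hypothetical callable standing in for a local MLIP. Periodic materials such as metal-organic frameworks would typically need an Ewald-type summation instead of the direct sum shown here.

```python
import numpy as np

# e^2 / (4 * pi * eps0) in eV * Angstrom, for charges in units of e.
COULOMB_CONSTANT = 14.399645


def coulomb_energy(positions, charges):
    """Direct pairwise Coulomb energy (eV) of a finite, non-periodic cluster."""
    positions = np.asarray(positions, dtype=float)
    charges = np.asarray(charges, dtype=float)
    # Indices of all unique atom pairs (i < j).
    i, j = np.triu_indices(len(charges), k=1)
    distances = np.linalg.norm(positions[i] - positions[j], axis=1)
    return COULOMB_CONSTANT * np.sum(charges[i] * charges[j] / distances)


def total_energy(positions, charges, short_range_energy):
    """Short-range model energy plus the explicit long-range term."""
    return short_range_energy(positions) + coulomb_energy(positions, charges)
```

The design point is the additive decomposition: the learned model only has to capture local interactions, while the slowly decaying electrostatic part is handled by an explicit physical term.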
