🤖 AI Summary
To address the limited cross-device parallel inference efficiency and poor adaptability of machine learning interatomic potentials (MLIPs) in large-scale atomistic simulations, this paper proposes a graph-level distributed architecture tailored for MLIP inference. Unlike conventional spatial domain decomposition, the approach leverages the intrinsic graph structure of graph neural networks (GNNs), combining zero-redundancy, graph-level parallelization with a graph partitioning algorithm to enable multi-device deployment of existing MLIP models without modifying model code; deployment is achieved through a plug-in interface. On an 8-GPU system, the framework performs near-million-atom calculations in a few seconds across mainstream MLIPs, including CHGNet, MACE, TensorNet, and eSEN, delivering substantial throughput improvements over single-device inference and demonstrating strong scalability and usability for distributed MLIP deployment.
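The zero-redundancy idea above can be illustrated with a toy sketch (this is not the DistMLIP implementation; the ownership rule and array layout are assumptions for illustration): every edge of the atomistic graph is assigned to exactly one device, in contrast to spatial decomposition, which duplicates boundary ("halo") atoms across devices.

```python
import numpy as np

def partition_edges(edge_index: np.ndarray, n_devices: int):
    """Assign each directed edge (i, j) to the device that owns its source atom i.

    edge_index: (2, E) integer array of directed edges.
    Returns a list of (2, E_d) arrays, one per device.
    """
    src = edge_index[0]
    owner = src % n_devices  # toy ownership rule: round-robin by atom index
    return [edge_index[:, owner == d] for d in range(n_devices)]

rng = np.random.default_rng(0)
edges = rng.integers(0, 1000, size=(2, 5000))  # toy graph: 1000 atoms, 5000 edges
parts = partition_edges(edges, n_devices=4)

# Zero redundancy: every edge lands on exactly one device, none are duplicated.
assert sum(p.shape[1] for p in parts) == edges.shape[1]
```

A real partitioner would balance edge counts and minimize cross-device communication rather than use a round-robin rule, but the invariant checked at the end is the same.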
📝 Abstract
Large-scale atomistic simulations are essential for bridging computational materials science and chemistry to realistic materials and drug discovery applications. In the past few years, the rapid development of machine learning interatomic potentials (MLIPs) has offered a way to scale up quantum-mechanical calculations. Parallelizing these interatomic potentials across multiple devices is a challenging but promising route to further extending simulation scales toward real-world applications. In this work, we present DistMLIP, an efficient distributed inference platform for MLIPs based on zero-redundancy, graph-level parallelization. In contrast to conventional space-partitioning parallelization, DistMLIP enables efficient MLIP parallelization through graph partitioning, allowing multi-device inference on flexible MLIP model architectures such as multi-layer graph neural networks. DistMLIP offers an easy-to-use, flexible, plug-in interface that enables distributed inference of pre-existing MLIPs. We demonstrate DistMLIP on four widely used, state-of-the-art MLIPs: CHGNet, MACE, TensorNet, and eSEN. We show that with DistMLIP, existing foundational potentials can perform near-million-atom calculations in a few seconds on 8 GPUs.
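Why graph-level parallelization can work at all for multi-layer GNNs comes down to a simple algebraic fact: neighbor aggregation is a sum over edges, so partial sums computed over disjoint edge partitions can be reduced to recover the single-device result exactly. The sketch below (a simplified toy model, not the actual DistMLIP kernels; "devices" are simulated as edge subsets) checks this equivalence for one message-passing step.

```python
import numpy as np

def aggregate(edge_index: np.ndarray, features: np.ndarray, n_atoms: int):
    """One message-passing step: sum source-atom features into each destination atom."""
    out = np.zeros((n_atoms, features.shape[1]))
    np.add.at(out, edge_index[1], features[edge_index[0]])
    return out

rng = np.random.default_rng(1)
n_atoms, n_edges = 500, 3000
edges = rng.integers(0, n_atoms, size=(2, n_edges))
x = rng.standard_normal((n_atoms, 8))

# Single-device reference result.
ref = aggregate(edges, x, n_atoms)

# "Distributed": split the edges across 4 simulated devices, aggregate
# partial messages independently, then reduce (sum) the partial outputs.
chunks = np.array_split(np.arange(n_edges), 4)
partials = [aggregate(edges[:, idx], x, n_atoms) for idx in chunks]
dist = sum(partials)

assert np.allclose(ref, dist)  # graph partitioning preserves the aggregation result
```

In a real multi-GPU setting each partial aggregation runs on its own device, and the reduction step is where cross-device communication happens; minimizing the edges cut by the partition minimizes that communication.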