🤖 AI Summary
This work proposes an element-wise nonlinear Mixture-of-Experts (MoE) architecture that increases the expressive capacity of machine learning interatomic potentials (MLIPs) while preserving computational efficiency. By combining sparse activation, a shared-expert mechanism, and an element-aware routing strategy, the approach enables chemically interpretable expert specialization and reveals a striking correspondence between expert assignment and periodic-table trends. Evaluated on the OMol25, OMat24, and OC20M benchmarks, the model achieves state-of-the-art accuracy, markedly improving the ability of MLIPs to model diverse chemical environments and generalize across complex materials systems.
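In equation form, one plausible formalization of this design (notation ours, not taken from the paper): for atom $i$ with feature vector $\mathbf{x}_i$ and atomic number $z_i$,

$$
\mathbf{y}_i = E_{\text{shared}}(\mathbf{x}_i) + \sum_{e \,\in\, \mathcal{T}_i} g_{i,e}\, E_e(\mathbf{x}_i),
\qquad
\mathcal{T}_i = \operatorname{TopK}\!\big(W_r \mathbf{e}_{z_i}\big),
\qquad
g_i = \operatorname{softmax}\!\big((W_r \mathbf{e}_{z_i})_{\mathcal{T}_i}\big),
$$

where $E_{\text{shared}}$ is the always-active shared expert, the $E_e$ are nonlinear experts of which only the top-$k$ are evaluated (sparse activation), and $\mathbf{e}_{z_i}$ is a one-hot element encoding, so the routing logits depend on the element alone (element-aware routing).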
📝 Abstract
Machine Learning Interatomic Potentials (MLIPs) enable accurate large-scale atomistic simulations, yet efficiently improving their expressive capacity remains challenging. Here we systematically develop Mixture-of-Experts (MoE) and Mixture-of-Linear-Experts (MoLE) architectures for MLIPs and analyze the effects of routing strategies and expert designs. We show that sparse activation combined with shared experts yields substantial performance gains, and that nonlinear MoE formulations outperform MoLE when shared experts are present, underscoring the importance of nonlinear expert specialization. Furthermore, element-wise routing consistently surpasses configuration-level routing, while global MoE routing often leads to numerical instability. The resulting element-wise MoE model achieves state-of-the-art accuracy across the OMol25, OMat24, and OC20M benchmarks. Analysis of routing patterns reveals chemically interpretable expert specialization aligned with periodic-table trends, indicating that the model effectively captures element-specific chemical characteristics for precise interatomic modeling.
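The paper itself does not ship code here, but the mechanism described in the abstract can be sketched as follows. This is a minimal, illustrative PyTorch-style implementation under our own assumptions: all names (`ElementwiseMoE`, `num_experts`, `top_k`, the MLP expert shape) are hypothetical, not taken from the paper.

```python
# Minimal sketch (not the authors' code) of an element-wise MoE layer:
# a shared expert is always applied, while a router keyed on the atomic
# number sparsely activates a top-k subset of nonlinear experts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ElementwiseMoE(nn.Module):  # hypothetical name
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2,
                 max_atomic_number: int = 118):
        super().__init__()
        # Nonlinear experts (small MLPs), as opposed to linear MoLE experts.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        ])
        # Shared expert: active for every atom, independent of routing.
        self.shared_expert = nn.Sequential(
            nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim)
        )
        # Element-aware router: one logit vector per atomic number, so all
        # atoms of the same element receive the same expert assignment.
        self.router = nn.Embedding(max_atomic_number + 1, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor, atomic_numbers: torch.Tensor) -> torch.Tensor:
        # x: (num_atoms, dim) per-atom features; atomic_numbers: (num_atoms,)
        logits = self.router(atomic_numbers)              # (num_atoms, E)
        top_w, top_idx = logits.topk(self.top_k, dim=-1)  # sparse activation
        top_w = F.softmax(top_w, dim=-1)                  # renormalize top-k

        out = self.shared_expert(x)                       # shared path
        for e, expert in enumerate(self.experts):
            hit = top_idx == e                            # (num_atoms, top_k)
            routed = hit.any(dim=-1).nonzero(as_tuple=True)[0]
            if routed.numel() == 0:
                continue                                  # expert unused here
            # Combined gate weight for the atoms routed to expert e.
            w = (top_w * hit).sum(dim=-1, keepdim=True)[routed]
            out = out.index_add(0, routed, w * expert(x[routed]))
        return out

# Tiny usage demo: 5 atoms (H, O, O, Fe, Fe) with 128-dim features.
layer = ElementwiseMoE(dim=128)
x = torch.randn(5, 128)
z = torch.tensor([1, 8, 8, 26, 26])
y = layer(x, z)  # (5, 128); both O atoms (and both Fe atoms) share a routing
```

Because the router is keyed on atomic number alone, every atom of a given element shares one expert mix, which is what makes the learned routing patterns directly comparable to periodic-table trends; per-configuration or global routing, which the abstract reports as weaker or numerically unstable, would not have this property.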