Scalable Machine Learning Force Fields for Macromolecular Systems Through Long-Range Aware Message Passing

📅 2026-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a key limitation of existing machine learning force fields: their reliance on fixed-cutoff local architectures, which struggle to accurately capture long-range non-covalent interactions in large molecular systems and cause force errors to accumulate with system size. To overcome this, the authors construct MolLR25, a high-accuracy DFT reference dataset comprising systems of up to 1,200 atoms, and introduce E2Former-LSR, an equivariant Transformer architecture that integrates an explicit long-range attention mechanism. The approach markedly improves the modeling of complex protein conformations and long-range interactions while remaining computationally efficient: the model achieves state-of-the-art accuracy on large molecular systems with stable error scaling and runs up to 30% faster than purely local models.

📝 Abstract
Machine learning force fields (MLFFs) have revolutionized molecular simulations by providing quantum mechanical accuracy at the speed of molecular mechanics computations. However, these models' fundamental reliance on fixed-cutoff architectures limits their applicability to macromolecular systems where long-range interactions dominate. We demonstrate that this locality constraint causes force prediction errors to scale monotonically with system size, revealing a critical architectural bottleneck. To overcome this, we establish the systematically designed MolLR25 (Molecules with Long-Range effect) benchmark, generated with high-fidelity DFT on systems of up to 1,200 atoms, and introduce E2Former-LSR, an equivariant transformer that explicitly integrates long-range attention blocks. E2Former-LSR exhibits stable error scaling, achieves superior fidelity in capturing non-covalent decay, and maintains precision on complex protein conformations. Crucially, its efficient design provides up to a 30% speedup over purely local models. This work validates the necessity of non-local architectures for generalizable MLFFs, enabling high-fidelity molecular dynamics for large-scale chemical and biological systems.
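The core architectural idea the abstract describes, augmenting a fixed-cutoff local message-passing model with an explicit attention term that couples all atom pairs regardless of distance, can be illustrated with a minimal numpy sketch. Everything below (function names, feature dimensions, the single-head attention form) is an illustrative assumption for exposition, not the actual E2Former-LSR implementation, which is equivariant and far more elaborate.

```python
import numpy as np

def local_messages(pos, feats, cutoff=5.0):
    """Sum neighbor features within a fixed distance cutoff (the 'local' part
    that standard MLFF architectures rely on)."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    mask = (d < cutoff) & (d > 0)              # neighbors only, no self-term
    return mask.astype(feats.dtype) @ feats

def global_attention(feats, Wq, Wk, Wv):
    """Single-head attention over ALL atoms -- no distance cutoff, so distant
    pairs can still exchange information (the 'long-range' part)."""
    q, k, v = feats @ Wq, feats @ Wk, feats @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)         # row-wise softmax
    return w @ v

rng = np.random.default_rng(0)
n_atoms, dim = 64, 16
pos = rng.uniform(0.0, 20.0, size=(n_atoms, 3))   # toy coordinates
feats = rng.normal(size=(n_atoms, dim))           # toy atom features
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

# A purely local model would stop at local_messages; adding the attention
# term is what lets errors stay flat as the system grows.
updated = local_messages(pos, feats) + global_attention(feats, Wq, Wk, Wv)
print(updated.shape)   # (64, 16)
```

Note the cost trade-off this sketch makes visible: the local term scales with the number of in-cutoff pairs, while naive global attention is quadratic in atom count, so an efficient long-range design (as the paper claims for E2Former-LSR) must keep that second term cheap.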
Problem

Research questions and friction points this paper is trying to address.

machine learning force fields
long-range interactions
macromolecular systems
scalability
non-covalent interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

long-range interactions
machine learning force fields
equivariant transformer
non-covalent interactions
scalable molecular simulation