🤖 AI Summary
This work addresses the limitations of current machine learning interatomic potentials in capturing long-range electrostatics, charge transfer, and variable spin states, particularly in systems dominated by non-covalent interactions and in large biomolecules. The proposed method integrates an explicit polarizable electrostatic model into the MACE architecture, combining non-self-consistent iterative polarization with global charge equilibration via learnable Fukui functions, so that charge and spin distributions are modeled efficiently and in a physically consistent way. The resulting potential responds to external fields and provides interpretable spin-resolved charge densities. Trained on the OMol25 dataset, it achieves accuracy competitive with hybrid DFT on thermochemistry, reaction barriers, conformational energies, and transition metal complexes. Notably, it predicts molecular crystal formation energies in the X23-DMC benchmark with sub-kcal/mol error and achieves a fourfold improvement over short-range models on protein–ligand interactions.
📝 Abstract
Accurate modelling of electrostatic interactions and charge transfer is fundamental to computational chemistry, yet most machine learning interatomic potentials (MLIPs) rely on local atomic descriptors that cannot capture long-range electrostatic effects. We present a new electrostatic foundation model for molecular chemistry that extends the MACE architecture with explicit treatment of long-range interactions and electrostatic induction. Our approach combines local many-body geometric features with a non-self-consistent field formalism that updates learnable charge and spin densities through polarisable iterations to model induction. A global charge equilibration step based on learnable Fukui functions then fixes the total charge and total spin. This design enables an accurate and physically consistent description of systems with varying charge and spin states while maintaining computational efficiency. Trained on the OMol25 dataset of 100 million hybrid DFT calculations, our models achieve chemical accuracy across diverse benchmarks and are competitive with hybrid DFT on thermochemistry, reaction barriers, conformational energies, and transition metal complexes. Notably, we demonstrate that including long-range electrostatics substantially improves the description of non-covalent interactions and supramolecular complexes relative to non-electrostatic models, including sub-kcal/mol prediction of molecular crystal formation energy in the X23-DMC dataset and a fourfold improvement over short-ranged models on protein-ligand interactions. The model's ability to handle variable charge and spin states, respond to external fields, provide interpretable spin-resolved charge densities, and maintain accuracy from small molecules to protein-ligand complexes positions it as a versatile tool for computational molecular chemistry and drug discovery.
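To make the idea of charge equilibration via Fukui functions more concrete, the following is a minimal sketch and not the model's actual implementation. It assumes one common form of the constraint: predicted atomic charges are shifted by the residual total charge, distributed according to per-atom weights that sum to one. The softmax normalisation, the point-charge Coulomb sum in atomic units, and all function and variable names are illustrative assumptions; the abstract does not specify these details.

```python
import math


def softmax(logits):
    """Normalise arbitrary per-atom scores into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def equilibrate_charges(raw_charges, fukui_logits, total_charge):
    """Shift raw atomic charges so that they sum to total_charge.

    Assumed form: q_i = q_i_raw + f_i * (Q - sum_j q_j_raw), where the
    weights f_i play the role of a Fukui function (dq_i / dQ) and sum to 1.
    """
    fukui = softmax(fukui_logits)
    residual = total_charge - sum(raw_charges)
    return [q + f * residual for q, f in zip(raw_charges, fukui)]


def coulomb_energy(charges, positions):
    """Pairwise point-charge Coulomb energy in atomic units (no cutoff)."""
    energy = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            energy += charges[i] * charges[j] / r
    return energy


# Hypothetical three-atom system with a requested total charge of +1.
raw = [0.30, -0.55, 0.20]
logits = [0.0, 1.0, 0.0]
q = equilibrate_charges(raw, logits, total_charge=1.0)
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0), (0.0, 2.0, 0.0)]
e_coul = coulomb_energy(q, positions)
```

Because the weights sum to one, the corrected charges sum to the requested total charge by construction, regardless of the raw predictions. The same pattern could in principle be applied to a spin channel to constrain total spin, which is how the abstract describes the role of the Fukui functions.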