🤖 AI Summary
Existing machine learning implicit-solvent models rely solely on force matching, which leaves an arbitrary constant offset in the predicted energies and precludes absolute solvation free energy comparisons across molecules. To address this, we propose LSNN (Lambda Solvation Neural Network), a graph neural network trained jointly on force matching and on matching derivatives with respect to alchemical coupling parameters (λ) in an end-to-end multi-task setup. Trained on a dataset of approximately 300,000 small molecules, LSNN reaches accuracy comparable to explicit-solvent alchemical simulations for solvation free energy prediction (RMSE ≈ 0.5 kcal/mol) while accelerating computation by three to four orders of magnitude. Crucially, LSNN overcomes the main limitation of force-matching-only implicit-solvent models, whose energies cannot be used for absolute free energy comparisons, by producing solvation free energies that are comparable across chemical species. This provides an efficient framework for large-scale thermodynamic screening in drug discovery.
📝 Abstract
The implicit solvent approach offers a computationally efficient framework to model solvation effects in molecular simulations. However, its accuracy often falls short compared to explicit solvent models, limiting its use in precise thermodynamic calculations. Recent advancements in machine learning (ML) present an opportunity to overcome these limitations by leveraging neural networks to develop more precise implicit solvent potentials for diverse applications. A major drawback of current ML-based methods is their reliance on force-matching alone, which can lead to energy predictions that differ by an arbitrary constant and are therefore unsuitable for absolute free energy comparisons. Here, we introduce a novel methodology with a graph neural network (GNN)-based implicit solvent model, dubbed Lambda Solvation Neural Network (LSNN). In addition to force-matching, this network was trained to match the derivatives with respect to alchemical variables, ensuring that solvation free energies can be meaningfully compared across chemical species. Trained on a dataset of approximately 300,000 small molecules, LSNN achieves free energy predictions with accuracy comparable to explicit-solvent alchemical simulations, while offering a computational speedup and establishing a foundational framework for future applications in drug discovery.
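To illustrate why force matching alone cannot fix the energy scale, the toy sketch below compares two energy functions that differ only by a coordinate-independent offset. It is not the LSNN implementation: the energy functions, the offset `c`, the loss weights, and all function names are hypothetical, and derivatives are taken by finite differences rather than by a neural network's autodiff. The force-matching loss cannot distinguish the two energies, while adding a term that matches the derivative with respect to the coupling parameter λ can.

```python
# Toy sketch (hypothetical names; not the authors' code): force matching plus
# matching of dU/d(lambda), using central finite differences.

def ref_energy(x, lam):
    """Toy reference energy for coordinates x at coupling strength lam."""
    return lam * sum(xi ** 2 for xi in x)

def shifted_energy(x, lam, c=3.0):
    """Same forces as ref_energy, plus an offset c*lam that ignores coordinates."""
    return ref_energy(x, lam) + c * lam

def forces(energy, x, lam, h=1e-5):
    """F_i = -dU/dx_i, estimated by central differences."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append(-(energy(xp, lam) - energy(xm, lam)) / (2 * h))
    return out

def dU_dlam(energy, x, lam, h=1e-5):
    """dU/d(lambda), estimated by central differences."""
    return (energy(x, lam + h) - energy(x, lam - h)) / (2 * h)

def force_loss(model, ref, x, lam):
    """Mean squared error between model and reference forces."""
    fm, fr = forces(model, x, lam), forces(ref, x, lam)
    return sum((a - b) ** 2 for a, b in zip(fm, fr)) / len(x)

def lambda_loss(model, ref, x, lam):
    """Squared error between model and reference dU/d(lambda)."""
    return (dU_dlam(model, x, lam) - dU_dlam(ref, x, lam)) ** 2

def combined_loss(model, ref, x, lam, w_force=1.0, w_lam=1.0):
    """Weighted sum of the two terms; the weights are illustrative."""
    return (w_force * force_loss(model, ref, x, lam)
            + w_lam * lambda_loss(model, ref, x, lam))

x, lam = [0.5, -1.2, 2.0], 0.7
# Force matching alone: the offset is invisible (loss is essentially zero).
print("force-only loss:", force_loss(shifted_energy, ref_energy, x, lam))
# With the lambda-derivative term, the offset is penalized (about c**2).
print("combined loss:  ", combined_loss(shifted_energy, ref_energy, x, lam))
```

In this toy setting the λ-derivative term is what exposes the offset; how LSNN weights and evaluates the two terms is not specified in the abstract.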