🤖 AI Summary
Existing coarse-grained machine learning potentials (CG MLPs) are trained on configurations drawn from equilibrium Boltzmann sampling, which converges slowly and undersamples transition states. This work brings enhanced sampling, specifically biasing along coarse-grained collective variables, into the CG force-matching framework. Biased simulations generate targeted configurations, and the forces are then recomputed with respect to the unbiased potential so that training remains thermodynamically consistent. The method preserves the correct potential of mean force while improving conformational-space coverage and training efficiency. Validation on the Müller-Brown potential and alanine dipeptide shows better sampling of transition regions, more accurate force and energy predictions, and a shorter simulation time to convergence than conventional unbiased sampling. The results support enhanced sampling for force matching as a promising direction for efficient and reliable training of CG MLPs.
📝 Abstract
Coarse-graining (CG) enables molecular dynamics (MD) simulations of larger systems and longer timescales than are feasible with atomistic models. Machine learning potentials (MLPs), with their capacity to capture many-body interactions, can accurately approximate the potential of mean force (PMF) in CG models. Current CG MLPs are typically trained bottom-up via force matching, which in practice relies on configurations sampled from the unbiased equilibrium Boltzmann distribution to ensure thermodynamic consistency. This convention has two key limitations: first, long atomistic trajectories are needed to reach convergence; and second, even once the trajectory has equilibrated, transition regions remain poorly sampled. To address these issues, we use enhanced sampling to bias along CG degrees of freedom during data generation, and then recompute the forces with respect to the unbiased potential. This strategy both shortens the simulation time required to produce equilibrated data and enriches sampling in transition regions, while preserving the correct PMF. We demonstrate its effectiveness on the Müller-Brown potential and capped alanine, and observe notable improvements. Our findings support the use of enhanced sampling for force matching as a promising direction to improve the accuracy and reliability of CG MLPs.
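The data-generation idea can be illustrated with a minimal sketch on the Müller-Brown potential. Everything specific here is an assumption made for illustration and is not taken from the paper: the CG coordinate is taken to be `x`, the bias is a harmonic restraint on `x`, the sampler is overdamped Langevin dynamics with an Euler-Maruyama step, and the function names and parameter values (`k`, `dt`, `kT`) are hypothetical. The point it shows is that the bias only steers where samples land, while the stored training label is the force from the unbiased potential. Because the bias depends only on the CG coordinate, it reweights the distribution over `x` but should leave the conditional distribution of `y` given `x` unchanged, so the conditional average of the unbiased force (the mean force) is still the force-matching target.

```python
import numpy as np

# Standard Müller-Brown parameters.
A = np.array([-200.0, -100.0, -170.0, 15.0])
a = np.array([-1.0, -1.0, -6.5, 0.7])
b = np.array([0.0, 0.0, 11.0, 0.6])
c = np.array([-10.0, -10.0, -6.5, 0.7])
X0 = np.array([1.0, 0.0, -0.5, -1.0])
Y0 = np.array([0.0, 0.5, 1.5, 1.0])


def potential(p):
    """Unbiased Müller-Brown energy at the 2D point p = (x, y)."""
    dx, dy = p[0] - X0, p[1] - Y0
    return float(np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)))


def unbiased_force(p):
    """Analytic force -grad V of the unbiased potential."""
    dx, dy = p[0] - X0, p[1] - Y0
    e = A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)
    fx = -np.sum(e * (2 * a * dx + b * dy))
    fy = -np.sum(e * (b * dx + 2 * c * dy))
    return np.array([fx, fy])


def bias_force(p, center, k):
    """Force from a harmonic bias 0.5*k*(x - center)^2 on the CG coordinate x.

    Assumption: the bias acts on x only, so its y-component is zero.
    """
    return np.array([-k * (p[0] - center), 0.0])


def sample_biased(center, k=50.0, n_steps=2000, dt=1e-4, kT=15.0, seed=0):
    """Run biased overdamped Langevin dynamics, storing unbiased force labels."""
    rng = np.random.default_rng(seed)
    p = np.array([center, 0.5])
    positions, labels = [], []
    for _ in range(n_steps):
        f_unb = unbiased_force(p)
        positions.append(p.copy())
        labels.append(f_unb[0])  # label: unbiased force projected on x
        # The dynamics feel the bias; the stored label does not.
        drift = f_unb + bias_force(p, center, k)
        p = p + dt * drift + np.sqrt(2.0 * kT * dt) * rng.normal(size=2)
    return np.array(positions), np.array(labels)


if __name__ == "__main__":
    # Restrain x at a few centers to cover regions between basins.
    for center in (-0.5, -0.25, 0.0, 0.25, 0.5):
        pos, f_x = sample_biased(center)
        print(center, pos[:, 0].mean(), f_x.mean())
```

The pairs `(pos[:, 0], f_x)` collected across all restraint centers would then serve as inputs and targets for a force-matching regression; in this sketch, any regressor fitted by least squares would approximate the conditional mean of `f_x` given `x`.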