🤖 AI Summary
Molecular dynamics (MD) simulations are too computationally expensive to capture long-timescale thermodynamic processes in biomolecules. Existing coarse-grained (CG) machine learning models, particularly those trained by force matching, often fail to reproduce free-energy differences between low-energy states, because fitting only the gradient of the energy may not capture absolute energy differences between those states; this hinders accurate reconstruction of the full thermodynamic landscape. To address this, we propose a multi-objective training framework that adds energy matching: within the CGSchNet architecture, we introduce a free-energy surface (FES) matching term estimated via time-lagged independent component analysis (TICA), optimize it jointly with conventional force matching, and systematically vary the relative weight of the energy loss. Experiments on the Chignolin protein show that energy matching does not yield statistically significant improvements in accuracy, but that the energy-loss weight produces distinct tendencies in how models generalize the FES. These results point to improved energy estimation and multi-modal loss formulations as directions for future CG modeling.
📝 Abstract
Molecular dynamics (MD) simulations provide atomistic insight into biomolecular systems but are often limited by the high computational cost of reaching long timescales. Coarse-grained machine learning models offer a promising avenue for accelerating sampling, yet conventional force-matching approaches often fail to capture the full thermodynamic landscape, because fitting a model to the gradient of the energy may not reproduce the absolute energy differences between low-energy conformational states. In this work, we incorporate a complementary energy-matching term into the loss function. We evaluate our framework on the Chignolin protein using the CGSchNet model, systematically varying the weight of the energy loss term. While energy matching did not yield statistically significant improvements in accuracy, it revealed distinct tendencies in how models generalize the free energy surface. Our results suggest future opportunities to enhance coarse-grained modeling through improved energy estimation techniques and multi-modal loss formulations.
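The weighted combination of force matching and energy matching described in the abstract can be illustrated with a minimal, framework-free sketch. The abstract does not give the exact form of either loss term, so everything here is an assumption: both terms are taken to be mean squared errors, energies are mean-centered before comparison (since energies are only defined up to an additive constant), and the names `force_matching_loss`, `energy_matching_loss`, `combined_loss`, and `energy_weight` are hypothetical. In practice these quantities would be tensors inside an automatic-differentiation framework rather than plain lists.

```python
from statistics import fmean
from typing import Sequence


def force_matching_loss(pred_forces: Sequence[float],
                        ref_forces: Sequence[float]) -> float:
    """Mean squared error over flattened force components (assumed form)."""
    return fmean([(p - r) ** 2 for p, r in zip(pred_forces, ref_forces)])


def energy_matching_loss(pred_energies: Sequence[float],
                         ref_energies: Sequence[float]) -> float:
    """MSE between mean-centered energies (assumed form).

    Centering removes the arbitrary additive constant, so only the
    relative energies between conformations are penalized.
    """
    pred_mean = fmean(pred_energies)
    ref_mean = fmean(ref_energies)
    return fmean([((p - pred_mean) - (r - ref_mean)) ** 2
                  for p, r in zip(pred_energies, ref_energies)])


def combined_loss(pred_forces: Sequence[float],
                  ref_forces: Sequence[float],
                  pred_energies: Sequence[float],
                  ref_energies: Sequence[float],
                  energy_weight: float) -> float:
    """Force-matching loss plus a weighted energy-matching term."""
    return (force_matching_loss(pred_forces, ref_forces)
            + energy_weight * energy_matching_loss(pred_energies, ref_energies))


# Toy usage: sweep the (hypothetical) energy weight on made-up numbers.
toy_pred_forces = [0.0, 0.0, 0.0, 0.0]
toy_ref_forces = [1.0, 1.0, 1.0, 1.0]
toy_pred_energies = [0.0, 1.0, 2.0, 3.0]
toy_ref_energies = [0.0, 2.0, 4.0, 6.0]

for weight in (0.0, 0.5, 1.0):
    value = combined_loss(toy_pred_forces, toy_ref_forces,
                          toy_pred_energies, toy_ref_energies, weight)
    print(f"energy_weight={weight}: loss={value}")
```

Setting `energy_weight` to zero recovers pure force matching, which makes the weight a single knob for the kind of sweep the abstract describes.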