🤖 AI Summary
Problem: Conventional enhanced sampling methods rely on predefined collective variables, which are often difficult to identify. This limits automated modeling of long-timescale, all-atom conformational dynamics in proteins.
Method: We extend LD-FPG, a generative model of all-atom equilibrium ensembles, with a temporal propagator that operates in its learned latent space. Within a unified encoder–propagator–decoder architecture, we compare three propagator classes: score-guided Langevin dynamics, Koopman-based linear operators, and autoregressive neural networks.
Contribution: This work adds explicit time propagation to latent-space generation of all-atom protein ensembles. It characterizes the trade-offs among three propagator types in long-horizon stability, backbone and side-chain ensemble fidelity, and functional free-energy landscapes. Autoregressive propagation gives the most robust long rollouts; score-guided Langevin dynamics best recovers side-chain thermodynamics when the score is well learned; and Koopman-based propagation provides an interpretable, lightweight baseline that tends to damp fluctuations. The results offer practical guidance for latent-space simulators of all-atom protein dynamics.
📝 Abstract
Simulating the long-timescale dynamics of biomolecules is a central challenge in computational science. While enhanced sampling methods can accelerate these simulations, they rely on pre-defined collective variables that are often difficult to identify. A recent generative model, LD-FPG, demonstrated that this problem could be bypassed by learning to sample the static equilibrium ensemble as all-atom deformations from a reference structure, establishing a powerful method for all-atom ensemble generation. However, while this approach successfully captures a system's probable conformations, it does not model the temporal evolution between them. Here we extend LD-FPG with a temporal propagator that operates within the learned latent space and compare three classes: (i) score-guided Langevin dynamics, (ii) Koopman-based linear operators, and (iii) autoregressive neural networks. Within a unified encoder-propagator-decoder framework, we evaluate long-horizon stability, backbone and side-chain ensemble fidelity, and functional free-energy landscapes. Autoregressive neural networks deliver the most robust long rollouts; score-guided Langevin best recovers side-chain thermodynamics when the score is well learned; and Koopman provides an interpretable, lightweight baseline that tends to damp fluctuations. These results clarify the trade-offs among propagators and offer practical guidance for latent-space simulators of all-atom protein dynamics.
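As a rough illustration of the three propagator classes compared above, the sketch below applies a toy version of each to a 2-D latent vector. It is not the LD-FPG implementation: the encoder and decoder are omitted, and every name here (`fit_koopman`, `langevin_step`, `rollout`, the matrix `A`, the Gaussian score, and the fixed `tanh` map standing in for a trained autoregressive network) is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(z0, step_fn, n_steps):
    """Apply step_fn repeatedly; returns an array of shape (n_steps + 1, dim)."""
    traj = [z0]
    for _ in range(n_steps):
        traj.append(step_fn(traj[-1]))
    return np.stack(traj)

def fit_koopman(Z):
    """Least-squares linear operator K such that Z[t + 1] ~= K @ Z[t]."""
    X, Y = Z[:-1], Z[1:]
    # Solve X @ K.T ~= Y in the least-squares sense.
    K_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return K_T.T

def langevin_step(z, score, step, rng):
    """One unadjusted overdamped Langevin step driven by a score function."""
    noise = rng.standard_normal(z.shape)
    return z + step * score(z) + np.sqrt(2.0 * step) * noise

# Toy "latent trajectory": a damped rotation (hypothetical stand-in for
# encoded MD frames).
theta = 0.2
A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
z0 = np.array([1.0, 0.0])
Z = rollout(z0, lambda z: A @ z, 50)

# (ii) Koopman-style propagator: linear operator fitted to the trajectory.
K = fit_koopman(Z)
traj_koop = rollout(z0, lambda z: K @ z, 100)

# (i) Score-guided Langevin: here the score of a standard Gaussian,
# grad log p(z) = -z, replaces a learned score network (assumption).
gaussian_score = lambda z: -z
traj_lang = rollout(z0, lambda z: langevin_step(z, gaussian_score, 0.01, rng), 100)

# (iii) Autoregressive propagator: a fixed nonlinear map stands in for a
# trained neural network that predicts the next latent from the current one.
W = np.array([[0.9, -0.3],
              [0.3,  0.9]])
traj_ar = rollout(z0, lambda z: np.tanh(W @ z), 100)
```

In this toy setting the fitted `K` reproduces the contracting dynamics it was trained on, so its rollout shrinks toward the origin, which is one simple way a linear propagator can damp fluctuations. In a full simulator each rollout would be passed through the decoder to recover all-atom coordinates.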