🤖 AI Summary
Sampling transition pathways between metastable molecular states separated by high energy barriers remains challenging: unbiased molecular dynamics simulations are computationally infeasible, while existing machine learning methods are often limited to simple systems and rely on collective variables that require costly domain expertise. To address this, we propose TPS-DPS, which trains a diffusion path sampler (DPS) to sample transition paths without collective variables. Our key contributions are: (1) a training objective based on the log-variance divergence, with learnable control variates that reduce the variance of gradient estimators; (2) an off-policy training scheme that combines a replay buffer with simulated annealing to improve sample efficiency and diversity; and (3) a scale-based equivariant parameterization of the bias forces for scalability to large systems. Evaluated on a synthetic system, a small peptide, and fast-folding proteins, TPS-DPS produces more realistic and diverse transition pathways than existing baselines.
📝 Abstract
Understanding transition pathways between two metastable states of a molecular system is crucial to advancing drug discovery and materials design. However, unbiased molecular dynamics (MD) simulations are computationally infeasible because of the high energy barriers that separate these states. Although machine learning techniques have recently been proposed to sample rare events, they are often limited to simple systems and rely on collective variables (CVs) derived from costly domain expertise. In this paper, we introduce a novel approach that trains diffusion path samplers (DPS) to address the transition path sampling (TPS) problem without requiring CVs. We reformulate the problem as amortized sampling from the transition path distribution, minimizing the log-variance divergence between the path distribution induced by DPS and the transition path distribution. Based on the log-variance divergence, we propose learnable control variates to reduce the variance of gradient estimators, and an off-policy training objective with replay buffers and simulated annealing to improve sample efficiency and diversity. We also propose a scale-based equivariant parameterization of the bias forces to ensure scalability to large systems. We extensively evaluate our approach, termed TPS-DPS, on a synthetic system, a small peptide, and challenging fast-folding proteins, demonstrating that it produces more realistic and diverse transition pathways than existing baselines.
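To make the log-variance objective concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sampled path comes with two numbers: an unnormalized log-density under the target path distribution and a log-density under the sampler. The function names (`log_variance_loss`, `control_variate_loss`) and the scalar `w` are hypothetical illustrations. The sketch shows two generic properties of this kind of objective: it is unchanged when the target is shifted by an unknown normalizing constant, and a scalar control variate `w` minimizes the squared-error form when it equals the mean log-ratio.

```python
from statistics import fmean


def log_ratio(log_target, log_sampler):
    """Per-path log-ratio log p_target(x) - log q_sampler(x)."""
    return [t - s for t, s in zip(log_target, log_sampler)]


def log_variance_loss(log_target, log_sampler):
    """Batch variance of the log-ratio (hypothetical helper).

    The paths may come from any reference distribution, for example a
    replay buffer, which is what makes the objective off-policy.
    """
    r = log_ratio(log_target, log_sampler)
    mean_r = fmean(r)
    return fmean([(x - mean_r) ** 2 for x in r])


def control_variate_loss(log_target, log_sampler, w):
    """Squared-error form with a scalar control variate w (assumed name).

    In a real training loop w would be a learnable parameter; here it is
    a plain float so the effect can be checked by hand.
    """
    r = log_ratio(log_target, log_sampler)
    return fmean([(x - w) ** 2 for x in r])


# Toy log-densities for four paths (made-up numbers, illustration only).
log_target = [-1.0, -2.5, -0.5, -3.0]
log_sampler = [-1.5, -2.0, -1.0, -2.5]

base = log_variance_loss(log_target, log_sampler)

# Shifting the target by a constant (unknown normalizer) leaves the loss unchanged.
shifted = log_variance_loss([t + 3.0 for t in log_target], log_sampler)

# The squared-error form equals the variance when w is the mean log-ratio,
# and is larger for any other choice of w.
w_opt = fmean(log_ratio(log_target, log_sampler))
at_opt = control_variate_loss(log_target, log_sampler, w_opt)
off_opt = control_variate_loss(log_target, log_sampler, w_opt + 1.0)
```

In this toy batch, `base`, `shifted`, and `at_opt` coincide, while `off_opt` is strictly larger. How the paper parameterizes and trains its control variates is not specified in the abstract, so the scalar `w` here is only one simple choice.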