🤖 AI Summary
Traditional molecular dynamics (MD) simulations are computationally expensive and struggle to reach biologically relevant timescales, while existing generative models are limited by architectural constraints, error accumulation, and inadequate modeling of spatio-temporal dynamics. This work proposes STAR-MD, a scalable SE(3)-equivariant diffusion model built on a causal diffusion transformer with joint spatio-temporal attention. By preserving SE(3) equivariance while avoiding the memory bottlenecks of existing methods, STAR-MD autoregressively generates microsecond-scale protein trajectories. On the ATLAS benchmark, STAR-MD achieves state-of-the-art performance in conformational coverage, structural validity, and dynamic fidelity, and it produces stable, high-quality microsecond-scale trajectories where baseline methods fail.
📝 Abstract
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics, but their computational cost limits access to biologically relevant timescales. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation due to architectural constraints, error accumulation, and inadequate modeling of spatio-temporal dynamics. We present STAR-MD (Spatio-Temporal Autoregressive Rollout for Molecular Dynamics), a scalable SE(3)-equivariant diffusion model that generates physically plausible protein trajectories over microsecond timescales. Our key innovation is a causal diffusion transformer with joint spatio-temporal attention that efficiently captures complex space-time dependencies while avoiding the memory bottlenecks of existing methods. On the standard ATLAS benchmark, STAR-MD achieves state-of-the-art performance across all metrics, substantially improving conformational coverage, structural validity, and dynamic fidelity compared to previous methods. STAR-MD successfully extrapolates to generate stable microsecond-scale trajectories where baseline methods fail catastrophically, maintaining high structural quality throughout the extended rollout. Our comprehensive evaluation reveals severe limitations in current models for long-horizon generation, while demonstrating that STAR-MD's joint spatio-temporal modeling enables robust dynamics simulation at biologically relevant timescales, paving the way for accelerated exploration of protein function.
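To make "causal" and "joint spatio-temporal attention" concrete, the sketch below builds one possible attention mask. It assumes, purely for illustration, that each token is a (frame, residue) pair flattened frame by frame, and that a token may attend to every residue in its own frame and in all earlier frames. The abstract does not specify the tokenization or mask layout, so the function name `block_causal_mask` and this layout are hypothetical and should not be read as the authors' implementation.

```python
def block_causal_mask(num_frames, num_residues):
    """Return mask[q][k] = True if query token q may attend to key token k.

    Assumption (not from the paper): tokens are flattened frame-major,
    so token index = frame * num_residues + residue.
    """
    n = num_frames * num_residues
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        q_frame = q // num_residues
        for k in range(n):
            k_frame = k // num_residues
            # Joint attention: any residue may attend to any residue,
            # but only within the current frame or earlier frames (causal in time).
            mask[q][k] = k_frame <= q_frame
    return mask


# Small example: 3 frames of 2 residues each gives 6 tokens.
mask = block_causal_mask(num_frames=3, num_residues=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Under this assumed layout the mask is lower block-triangular: attention is dense across residues inside each allowed frame and blocked for all future frames, which is the property an autoregressive rollout needs so that each new frame depends only on frames already generated.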