EPO: Diverse and Realistic Protein Ensemble Generation via Energy Preference Optimization

📅 2025-11-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Accurate sampling of protein conformational ensembles is essential for understanding functional mechanisms, yet conventional molecular dynamics (MD) simulations sample insufficiently because of their high computational cost and trapping behind energy barriers. To address this, the paper proposes Energy Preference Optimization (EPO), an online refinement algorithm that turns a pretrained protein ensemble generator into an energy-aware sampler without additional MD trajectories. EPO explores the conformational landscape with stochastic differential equation (SDE) sampling and ranks the sampled conformations by energy using list-wise preference optimization. Because the probability of a long sampling trajectory is intractable in continuous-time generative models, EPO introduces a practical upper bound to approximate it, which makes the method easy to adapt to existing pretrained generators. On three benchmarks (Tetrapeptides, ATLAS, and Fast-Folding), EPO generates diverse and physically realistic ensembles and sets a new state of the art on nine evaluation metrics.

📝 Abstract

Accurate exploration of protein conformational ensembles is essential for uncovering function but remains hard because molecular-dynamics (MD) simulations suffer from high computational costs and energy-barrier trapping. This paper presents Energy Preference Optimization (EPO), an online refinement algorithm that turns a pretrained protein ensemble generator into an energy-aware sampler without extra MD trajectories. Specifically, EPO leverages stochastic differential equation sampling to explore the conformational landscape and incorporates a novel energy-ranking mechanism based on list-wise preference optimization. Crucially, EPO introduces a practical upper bound to efficiently approximate the intractable probability of long sampling trajectories in continuous-time generative models, making it easily adaptable to existing pretrained generators. On Tetrapeptides, ATLAS, and Fast-Folding benchmarks, EPO successfully generates diverse and physically realistic ensembles, establishing a new state-of-the-art in nine evaluation metrics. These results demonstrate that energy-only preference signals can efficiently steer generative models toward thermodynamically consistent conformational ensembles, providing an alternative to long MD simulations and widening the applicability of learned potentials in structural biology and drug discovery.
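The abstract mentions stochastic differential equation sampling without giving the integrator. As a rough illustration only, the sketch below shows a generic Euler-Maruyama step for a one-dimensional SDE of the form dx = f(x, t) dt + sigma dW. The function name `euler_maruyama`, the scalar state, the constant noise scale `sigma`, and the toy drift in the usage note are all assumptions made for this example; none of them is taken from EPO, whose sampler operates on protein conformations with a learned drift.

```python
import math
import random


def euler_maruyama(x0, drift, sigma, dt, n_steps, rng):
    """Integrate dx = drift(x, t) dt + sigma dW with the Euler-Maruyama scheme.

    All arguments are hypothetical stand-ins for illustration:
    x0      -- initial scalar state
    drift   -- callable f(x, t) returning the deterministic drift
    sigma   -- constant noise scale
    dt      -- time step
    n_steps -- number of integration steps
    rng     -- random.Random instance, passed in for reproducibility
    Returns the full path as a list of n_steps + 1 states.
    """
    x = x0
    path = [x]
    for step in range(n_steps):
        t = step * dt
        # Brownian increment over dt has standard deviation sqrt(dt).
        noise = rng.gauss(0.0, 1.0)
        x = x + drift(x, t) * dt + sigma * math.sqrt(dt) * noise
        path.append(x)
    return path
```

A possible call, using a toy drift that pulls the state toward zero, is `euler_maruyama(1.0, lambda x, t: -x, 0.5, 0.01, 1000, random.Random(0))`. Setting `sigma` to 0.0 reduces the scheme to the explicit Euler method, which is a convenient sanity check.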

Problem

Research questions and friction points this paper is trying to address.

Overcoming computational limitations of molecular dynamics for protein conformational exploration

Generating diverse and physically realistic protein ensembles without MD trajectories

Using energy-based preference signals to steer generation toward thermodynamically consistent protein structures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online refinement algorithm for protein ensemble generation

Energy-ranking mechanism using list-wise preference optimization

Practical upper bound that approximates the intractable probability of long sampling trajectories in continuous-time generative models
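To make the idea of energy ranking with list-wise preference optimization more concrete, here is a minimal sketch of a generic Plackett-Luce list-wise loss, a standard ranking objective. It is not claimed to be the loss EPO uses: the function name `plackett_luce_nll`, the per-conformation `scores` (a hypothetical stand-in for whatever model quantity is being optimized), and the convention that lower energy means more preferred are assumptions for this example.

```python
import math


def plackett_luce_nll(scores, energies):
    """Negative log-likelihood of the energy-sorted ranking under Plackett-Luce.

    scores   -- hypothetical model scores, one per candidate (higher = preferred)
    energies -- energies, one per candidate (assumed: lower = preferred)
    The loss is small when the model scores agree with the energy ranking.
    """
    if len(scores) != len(energies):
        raise ValueError("scores and energies must have the same length")
    # Order candidates from lowest to highest energy.
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    ranked = [scores[i] for i in order]
    nll = 0.0
    for k in range(len(ranked)):
        tail = ranked[k:]
        # Numerically stable log-sum-exp over the candidates not yet placed.
        m = max(tail)
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        nll -= ranked[k] - log_norm
    return nll
```

With scores that follow the energy order, for example `plackett_luce_nll([2.0, 1.0, 0.0], [-3.0, -2.0, -1.0])`, the loss is lower than with the scores reversed, so minimizing it pushes the model to score low-energy candidates higher.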
