EPO: Diverse and Realistic Protein Ensemble Generation via Energy Preference Optimization

📅 2025-11-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Accurate sampling of protein conformational ensembles is essential for understanding functional mechanisms, yet conventional molecular dynamics (MD) simulations sample poorly because of high computational cost and trapping behind energy barriers. The paper proposes Energy Preference Optimization (EPO), an online refinement algorithm that turns a pretrained protein ensemble generator into an energy-aware sampler without additional MD trajectories. EPO explores the conformational landscape with stochastic differential equation (SDE) sampling, ranks the sampled conformations by energy, and trains on those rankings with listwise preference optimization. To make this tractable for continuous-time generative models, it introduces a practical upper bound that approximates the otherwise intractable probability of long sampling trajectories. On three benchmarks (Tetrapeptides, ATLAS, and Fast-Folding), EPO generates diverse and physically realistic ensembles and sets a new state of the art on nine evaluation metrics.

📝 Abstract
Accurate exploration of protein conformational ensembles is essential for uncovering function but remains hard because molecular-dynamics (MD) simulations suffer from high computational costs and energy-barrier trapping. This paper presents Energy Preference Optimization (EPO), an online refinement algorithm that turns a pretrained protein ensemble generator into an energy-aware sampler without extra MD trajectories. Specifically, EPO leverages stochastic differential equation sampling to explore the conformational landscape and incorporates a novel energy-ranking mechanism based on list-wise preference optimization. Crucially, EPO introduces a practical upper bound to efficiently approximate the intractable probability of long sampling trajectories in continuous-time generative models, making it easily adaptable to existing pretrained generators. On Tetrapeptides, ATLAS, and Fast-Folding benchmarks, EPO successfully generates diverse and physically realistic ensembles, establishing a new state-of-the-art in nine evaluation metrics. These results demonstrate that energy-only preference signals can efficiently steer generative models toward thermodynamically consistent conformational ensembles, providing an alternative to long MD simulations and widening the applicability of learned potentials in structural biology and drug discovery.
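The abstract states that EPO explores the conformational landscape with stochastic differential equation sampling. As a rough illustration of what numerically integrating an SDE involves, the sketch below applies the Euler-Maruyama scheme to a one-dimensional double-well energy. The double-well energy, the function names, the step size, and the noise scale are all illustrative assumptions; the paper's sampler runs a pretrained generator over protein conformations, and its details are not given in this summary.

```python
import math
import random


def euler_maruyama(drift, x0, n_steps, dt, sigma, seed=0):
    """Integrate dx = drift(x) dt + sigma dW with the Euler-Maruyama scheme.

    Returns the list of visited states, including the initial one.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    x = float(x0)
    path = [x]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Deterministic drift step plus Gaussian noise scaled by sqrt(dt).
        x = x + drift(x) * dt + sigma * math.sqrt(dt) * noise
        path.append(x)
    return path


def double_well_energy(x):
    """Toy energy with minima at x = -1 and x = +1 (an assumption, not a protein force field)."""
    return (x * x - 1.0) ** 2


def double_well_drift(x):
    """Negative gradient of the toy energy; stands in for a learned drift."""
    return -4.0 * x * (x * x - 1.0)


# Sample one noisy trajectory starting at the barrier top (x = 0).
path = euler_maruyama(double_well_drift, x0=0.0, n_steps=2000, dt=1e-3, sigma=1.0)
energies = [double_well_energy(x) for x in path]
```

With `sigma=0.0` the same routine reduces to plain gradient descent on the toy energy, which is a convenient sanity check that the drift term points downhill.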
Problem

Research questions and friction points this paper is trying to address.

Overcoming computational limitations of molecular dynamics for protein conformational exploration
Generating diverse and physically realistic protein ensembles without MD trajectories
Optimizing energy preferences to produce thermodynamically consistent protein structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online refinement algorithm for protein ensemble generation
Energy-ranking mechanism using list-wise preference optimization
Practical upper bound for continuous-time generative models
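To make the energy-ranking idea above concrete, here is a minimal sketch of a generic list-wise preference loss: the Plackett-Luce negative log-likelihood of the ranking obtained by sorting candidates from lowest to highest energy. This is the textbook list-wise objective, not necessarily the exact loss used by EPO, whose form is not given in this summary. The function name and the plain-number `scores` are hypothetical; in a method like EPO the scores would presumably come from model log-probabilities of sampling trajectories, which the paper approximates with its upper bound.

```python
import math


def listwise_preference_loss(scores, energies):
    """Plackett-Luce negative log-likelihood of the energy-sorted ranking.

    scores[i]   -- model score for candidate i (higher = more preferred by the model)
    energies[i] -- energy of candidate i (lower = more preferred physically)
    """
    # Target ranking: candidate indices sorted from lowest to highest energy.
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    ranked = [scores[i] for i in order]

    loss = 0.0
    for k in range(len(ranked)):
        tail = ranked[k:]
        # Numerically stable log-sum-exp over the candidates not yet placed.
        m = max(tail)
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        # Negative log-probability of picking the k-th ranked item next.
        loss += log_z - ranked[k]
    return loss


energies = [1.0, 2.0, 3.0]            # candidate 0 has the lowest energy
aligned = listwise_preference_loss([3.0, 2.0, 1.0], energies)
reversed_ = listwise_preference_loss([1.0, 2.0, 3.0], energies)
```

Scores that agree with the energy ordering (`aligned`) give a smaller loss than scores that contradict it (`reversed_`), so minimizing this loss pushes the model to prefer low-energy candidates.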
Yuancheng Sun
Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Beijing Academy of Artificial Intelligence
Yuxuan Ren
Fudan University
Optical Manipulation, Nonlinear Optics, Medical Image Analysis, Deep Learning, Beam shaping
Zhaoming Chen
Beijing Academy of Artificial Intelligence
Xu Han
Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Beijing Academy of Artificial Intelligence
Kang Liu
Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Qiwei Ye
Beijing Academy of Artificial Intelligence
Scientific AI, AI for Science, Foundation Model