EvoEGF-Mol: Evolving Exponential Geodesic Flow for Structure-based Drug Design

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of conventional structure-based drug design (SBDD) methods: they model atomic coordinates in Euclidean space and chemical categories in probability space separately, and so do not respect the intrinsic statistical manifold of molecular distributions. The authors propose Evolving Exponential Geodesic Flow (EvoEGF-Mol), a framework that represents molecules as composite exponential-family distributions and generates them along exponential geodesics under the Fisher–Rao metric. Because a geodesic aimed directly at a static Dirac target collapses the trajectory instantaneously, the method replaces such targets with a dynamically concentrating target distribution and stabilizes training with a progressive parameter refinement architecture. The summary presents this combination of information geometry and generative modeling as a first in SBDD. The method reaches a 93.4% PoseBusters pass rate on CrossDocked, indicating strong geometric accuracy and interaction fidelity. It also outperforms baselines on MolGenBench, recovering bioactive scaffolds and generating candidates that pass established medicinal chemistry filters.

📝 Abstract
Structure-Based Drug Design (SBDD) aims to discover bioactive ligands. Conventional approaches construct probability paths separately in Euclidean and probabilistic spaces for continuous atomic coordinates and discrete chemical categories, leading to a mismatch with the underlying statistical manifolds. We address this issue from an information-geometric perspective by modeling molecules as composite exponential-family distributions and defining generative flows along exponential geodesics under the Fisher-Rao metric. To avoid the instantaneous trajectory collapse induced by geodesics directly targeting Dirac distributions, we propose Evolving Exponential Geodesic Flow for SBDD (EvoEGF-Mol), which replaces static Dirac targets with dynamically concentrating distributions, ensuring stable training via a progressive-parameter-refinement architecture. Our model approaches a reference-level PoseBusters passing rate (93.4%) on CrossDock, demonstrating remarkable geometric precision and interaction fidelity, while outperforming baselines on real-world MolGenBench tasks by recovering bioactive scaffolds and generating candidates that meet established MedChem filters.
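The collapse problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "exponential geodesic" means the e-geodesic of information geometry, i.e. a straight line in the natural parameters of an exponential family; the paper's exact construction may differ. The function names, the 1-D Gaussian and two-class categorical examples, and the linear variance schedule in `evolving_target_mean` are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math

def gaussian_e_geodesic(mu0, var0, mu1, var1, t):
    """Point at time t on the e-geodesic between N(mu0, var0) and N(mu1, var1).

    A 1-D Gaussian has natural parameters eta1 = mu / var and
    eta2 = -1 / (2 * var); the e-geodesic interpolates both linearly.
    """
    lam0, lam1 = 1.0 / var0, 1.0 / var1               # precisions
    lam_t = (1 - t) * lam0 + t * lam1                 # linear in eta2
    eta1_t = (1 - t) * mu0 * lam0 + t * mu1 * lam1    # linear in eta1
    return eta1_t / lam_t, 1.0 / lam_t                # (mean, variance)

def categorical_e_geodesic(p0, p1, t):
    """e-geodesic between categoricals: interpolate log-probabilities, renormalize."""
    logits = [(1 - t) * math.log(a) + t * math.log(b) for a, b in zip(p0, p1)]
    m = max(logits)                                   # subtract max for stability
    w = [math.exp(l - m) for l in logits]
    z = sum(w)
    return [x / z for x in w]

def evolving_target_mean(mu0, var0, mu1, t, var_start=1.0, var_end=1e-8):
    """Mean at time t when the target variance shrinks over time.

    Hypothetical linear schedule (an assumption, not taken from the paper):
    the target starts broad and only becomes near-Dirac as t approaches 1.
    """
    var_target = (1 - t) * var_start + t * var_end
    return gaussian_e_geodesic(mu0, var0, mu1, var_target, t)[0]

# Near-Dirac target: the huge target precision dominates as soon as t > 0,
# so the mean is already about 5.0 at t = 0.01 (the trajectory collapses).
collapsed_mean, _ = gaussian_e_geodesic(0.0, 1.0, 5.0, 1e-8, 0.01)

# Same endpoints, but with a target that concentrates gradually:
# the mean at t = 0.01 stays close to the prior mean of 0.
gradual_mean = evolving_target_mean(0.0, 1.0, 5.0, 0.01)
```

In this toy setting the near-Dirac target pulls the mean to the endpoint almost immediately, while the gradually concentrating target still reaches the same endpoint at t = 1 without the early jump. This is only meant to convey the intuition behind replacing static Dirac targets.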
Problem

Research questions and friction points this paper is trying to address.

Structure-Based Drug Design
statistical manifolds
probability paths
exponential-family distributions
Fisher-Rao metric
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exponential Geodesic Flow
Information Geometry
Structure-Based Drug Design
Fisher-Rao Metric
Dynamic Target Distribution
Yaowei Jin
Lingang Laboratory
Junjie Wang
Lingang Laboratory, School of Information Science and Technology, ShanghaiTech University
Cheng Cao
Applied Science Manager, Amazon AGI
Data Mining, Information Retrieval, Natural Language Processing
Penglei Wang
Lingang Laboratory
Duo An
Lingang Laboratory
Qian Shi
Lingang Laboratory