🤖 AI Summary
This work addresses generative drug design for synthesizable molecules by proposing an end-to-end model that jointly generates 3D molecular conformations and complete synthetic routes. To optimize binding activity and synthetic feasibility together, the authors extend flow matching to a hybrid discrete-continuous space and integrate it with GFlowNets for reward-guided joint sampling. The resulting framework, Compositional Generative Flows (CGFlow), jointly models conformational sampling and synthesis pathway construction. Evaluated on all 15 targets in LIT-PCBA, CGFlow achieves state-of-the-art binding affinity and a 5.8× improvement in sampling efficiency over a 2D synthesis-based baseline. On CrossDocked, it attains a Vina Dock score of −9.38 and an AiZynth synthesis success rate of 62.2%.
📝 Abstract
Many generative applications, such as synthesis-based 3D molecular design, involve constructing compositional objects with continuous features. Here, we introduce Compositional Generative Flows (CGFlow), a novel framework that extends flow matching to generate objects in compositional steps while modeling continuous states. Our key insight is that modeling compositional state transitions can be formulated as a straightforward extension of the flow matching interpolation process. We further build upon the theoretical foundations of generative flow networks (GFlowNets), enabling reward-guided sampling of compositional structures. We apply CGFlow to synthesizable drug design by jointly designing the molecule's synthetic pathway with its 3D binding pose. Our approach achieves state-of-the-art binding affinity on all 15 targets from the LIT-PCBA benchmark, and a 5.8$\times$ improvement in sampling efficiency compared to a 2D synthesis-based baseline. To the best of our knowledge, our method is also the first to achieve state-of-the-art performance in both Vina Dock (-9.38) and AiZynth success rate (62.2%) on the CrossDocked benchmark.
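To make the "flow matching interpolation process" concrete, here is a minimal sketch of the standard linear interpolant used in flow matching, x_t = (1 - t) x_0 + t x_1, together with one illustrative way a compositional schedule could be layered on top. The staggered schedule (`staggered_local_time`, `t_start`) is a hypothetical illustration and an assumption of this sketch; the abstract does not specify how CGFlow defines its compositional transitions, so this should not be read as the authors' formulation. All data are random toy values.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Standard linear flow matching interpolant: x_t = (1 - t) * x0 + t * x1."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Regression target for the velocity field under the linear interpolant."""
    return x1 - x0

def staggered_local_time(t, t_start):
    """Hypothetical schedule (assumption, not from the paper): a component
    added at time t_start stays at its noise state until then, and its own
    local time is rescaled to run from 0 to 1 over [t_start, 1]."""
    t_start = np.asarray(t_start, dtype=float)
    return np.clip((t - t_start) / (1.0 - t_start), 0.0, 1.0)

rng = np.random.default_rng(0)
x1 = rng.normal(size=(3, 3))          # toy "data": 3 components with 3D coordinates
x0 = rng.normal(size=(3, 3))          # noise sample of the same shape
t_start = np.array([0.0, 0.25, 0.5])  # hypothetical time at which each component is added

t = 0.5
local_t = staggered_local_time(t, t_start)    # one local time per component
x_t = interpolate(x0, x1, local_t[:, None])   # broadcast local time over coordinates
v_target = target_velocity(x0, x1)
```

In this toy setup the third component has only just been added at t = 0.5, so it is still at its noise state, while the first component is halfway along its straight-line path to the data.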