🤖 AI Summary
Macrocyclic compounds have long challenged molecular generative models because they are scarce in public datasets and topologically complex. This work proposes MacroGuide, a diffusion guidance mechanism built on topological data analysis: at each denoising step it constructs a Vietoris–Rips complex from the atomic positions and optimizes persistent homology features to promote ring formation. The guidance steers pretrained molecular generative models toward macrocycles in both unconditional and protein pocket–conditioned settings, imposing a cyclic topological constraint that standard deep generative models struggle to enforce. Applied to pretrained diffusion models, MacroGuide raises the macrocycle generation rate from 1% to 99% while matching or exceeding state-of-the-art performance on chemical validity, diversity, and PoseBusters checks.
📝 Abstract
Macrocycles are ring-shaped molecules that offer a promising alternative to small-molecule drugs due to their enhanced selectivity and binding affinity against difficult targets. Despite their chemical value, they remain underexplored in generative modeling, likely owing to their scarcity in public datasets and the challenges of enforcing topological constraints in standard deep generative models. We introduce MacroGuide: Topological Guidance for Macrocycle Generation, a diffusion guidance mechanism that uses Persistent Homology to steer the sampling of pretrained molecular generative models toward the generation of macrocycles, in both unconditional and conditional (protein pocket) settings. At each denoising step, MacroGuide constructs a Vietoris-Rips complex from atomic positions and promotes ring formation by optimizing persistent homology features. Empirically, applying MacroGuide to pretrained diffusion models increases macrocycle generation rates from 1% to 99%, while matching or exceeding state-of-the-art performance on key quality metrics such as chemical validity, diversity, and PoseBusters checks.
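To make the topological signal concrete, the following is a minimal sketch of how a ring-like arrangement of atoms shows up in the 1-dimensional persistent homology of a Vietoris-Rips filtration. It is not MacroGuide's implementation: the function names `h1_persistence_pairs` and `ring_score`, the toy coordinates, and the choice of "largest death minus birth" as the score are all assumptions made for illustration, and the abstract does not specify the actual objective. The code uses the standard boundary-matrix reduction over Z/2 and is only practical for a handful of points.

```python
import itertools
import math


def h1_persistence_pairs(points):
    """(birth, death) pairs of H1 for the Vietoris-Rips filtration of `points`."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Each simplex is (filtration value, dimension, sorted vertex tuple).
    simplices = [(0.0, 0, (i,)) for i in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        simplices.append((dist(i, j), 1, (i, j)))
    for i, j, k in itertools.combinations(range(n), 3):
        value = max(dist(i, j), dist(i, k), dist(j, k))
        simplices.append((value, 2, (i, j, k)))
    # Sorting by (value, dimension) guarantees faces precede their cofaces.
    simplices.sort()
    index = {verts: pos for pos, (_, _, verts) in enumerate(simplices)}

    reduced = {}      # column index -> reduced boundary (set of row indices)
    pivot_owner = {}  # lowest row index -> column that owns it
    pairs = []
    for col, (value, dim, verts) in enumerate(simplices):
        if dim == 0:
            continue
        # Boundary over Z/2: the set of codimension-1 faces.
        column = {index[face] for face in itertools.combinations(verts, dim)}
        while column and max(column) in pivot_owner:
            column ^= reduced[pivot_owner[max(column)]]
        if column:
            low = max(column)
            pivot_owner[low] = col
            reduced[col] = column
            # A triangle paired with an edge closes a loop born at that edge.
            if dim == 2 and value > simplices[low][0]:
                pairs.append((simplices[low][0], value))
    return pairs


def ring_score(points):
    """Hypothetical guidance signal: persistence of the most prominent loop."""
    return max((death - birth for birth, death in h1_persistence_pairs(points)),
               default=0.0)


# Toy inputs: 12 "atoms" on a unit circle versus 12 atoms on a straight line.
ring = [(math.cos(2 * math.pi * i / 12), math.sin(2 * math.pi * i / 12), 0.0)
        for i in range(12)]
chain = [(float(i), 0.0, 0.0) for i in range(12)]

print("ring score:", ring_score(ring))
print("chain score:", ring_score(chain))
```

In this sketch the circular arrangement yields one long-lived loop while the straight chain yields none. Because every birth and death value in a Vietoris-Rips filtration is a pairwise distance between two specific points, a score of this kind can in principle be differentiated with respect to atomic coordinates, which is what would allow it to act as a guidance term during denoising.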