🤖 AI Summary
Existing molecular graph generation methods lack a unified framework that handles heterogeneous design tasks with reliable controllability, and reinforcement learning over atom-level actions is hindered by a vast action space and chemically invalid intermediate states. To address this, this work proposes CoMole, a motif-aware graph diffusion foundation model that raises the decision unit from individual atoms to chemically meaningful motifs. Pretrained structural priors are carried into controllable generation, where reinforcement learning optimizes conditional reverse policies; the paper theoretically characterizes the bottleneck of atom-level RL and justifies motif-aware policy optimization. Across three heterogeneous benchmarks spanning materials and drug discovery, CoMole ranks first in controllability on all nine targets, reducing MAE by up to 48.2% relative to the strongest baselines while keeping molecular validity above 0.94 without rule-based correction or post-hoc filtering. It also transfers controllability to unseen properties by optimizing only task embeddings while the generator stays frozen.
📝 Abstract
Despite the success of foundation models in language and vision, molecular graph generation still lacks a unified framework for heterogeneous design tasks with reliable controllability. While reinforcement learning (RL) offers a natural post-training mechanism for task-specific optimization, applying it to graph generative models is hindered by the vast atom-wise action spaces and chemically invalid intermediate states. We propose Controllable Molecular Generative Foundation Models (CoMole), built with a unified motif-aware graph diffusion pipeline. By learning a motif-aware graph space, CoMole transfers pretrained structural priors into controllable generation, where RL optimizes conditional reverse policies over chemically meaningful decisions. We theoretically characterize the bottleneck of atom-level RL and justify motif-aware policy optimization. Across three heterogeneous benchmarks spanning materials and drug discovery, CoMole ranks first in controllability on all nine targets, reduces MAE by up to 48.2% relative to the strongest baselines, and maintains validity above 0.94 without rule-based correction or post-hoc filtering. We further show that CoMole transfers controllability to unseen properties by optimizing only task embeddings with the generator frozen, achieving performance competitive with strong task-specific baselines.
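The idea of adapting to a new property by optimizing only a task embedding while the generator stays frozen can be illustrated with a toy sketch. This is not CoMole's implementation: the "generator" here is a hypothetical fixed linear map, the objective is plain squared error minimized by gradient descent rather than RL, and every name (`frozen_generator`, `fit_task_embedding`, `weights`) is an assumption made for illustration.

```python
# Toy illustration only; none of these names come from the CoMole paper.
# A fixed linear map stands in for a frozen generator, and squared-error
# gradient descent stands in for the RL objective used in the paper.

def frozen_generator(embedding, weights):
    """Map a task embedding to a scalar 'property' using fixed weights."""
    return sum(w * e for w, e in zip(weights, embedding))


def fit_task_embedding(weights, target, dim, lr=0.05, steps=200):
    """Run gradient descent on the embedding only; weights are never updated."""
    embedding = [0.0] * dim
    for _ in range(steps):
        error = frozen_generator(embedding, weights) - target
        # Gradient of 0.5 * error**2 with respect to embedding[i] is error * weights[i].
        embedding = [e - lr * error * w for e, w in zip(embedding, weights)]
    return embedding


weights = [0.5, -1.0, 2.0]      # hypothetical frozen parameters
original = list(weights)        # copy kept to confirm nothing was changed
embedding = fit_task_embedding(weights, target=3.0, dim=3)
abs_error = abs(frozen_generator(embedding, weights) - 3.0)
```

Only `embedding` changes during the loop, so the same frozen parameters can in principle be reused across targets by fitting one embedding per target.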