Synthesizable Molecular Generation via Soft-constrained GFlowNets with Rich Chemical Priors

📅 2026-02-04
🤖 AI Summary
This work addresses the poor synthesizability commonly observed in de novo molecular generation by proposing S3-GFN, which replaces hard, template-based constraints with soft regularization. Within a sequence-based GFlowNet framework, S3-GFN combines molecular priors learned from large-scale SMILES corpora with off-policy replay training and a contrastive learning signal, computed from separate buffers of synthesizable and unsynthesizable samples, to steer generation towards high-reward, synthesizable SMILES molecules. The reported experiments show that at least 95% of the generated molecules are synthesizable, with higher rewards across diverse tasks. These results suggest that soft constraints can balance reward maximization with practical synthesizability in molecular design.

📝 Abstract
The application of generative models for experimental drug discovery campaigns is severely limited by the difficulty of designing molecules de novo that can be synthesized in practice. Previous works have leveraged Generative Flow Networks (GFlowNets) to impose hard synthesizability constraints through the design of state and action spaces based on predefined reaction templates and building blocks. Despite the promising prospects of this approach, it currently lacks flexibility and scalability. As an alternative, we propose S3-GFN, which generates synthesizable SMILES molecules via simple soft regularization of a sequence-based GFlowNet. Our approach leverages rich molecular priors learned from large-scale SMILES corpora to steer molecular generation towards high-reward, synthesizable chemical spaces. The model induces constraints through off-policy replay training with a contrastive learning signal based on separate buffers of synthesizable and unsynthesizable samples. Our experiments show that S3-GFN learns to generate synthesizable molecules ($\geq 95\%$) with higher rewards in diverse tasks.
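
The replay-and-contrast idea described in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the paper's actual objective: the `ReplayBuffer` class, the hinge-style `contrastive_penalty`, the `weight` and `margin` parameters, and the choice to apply the trajectory-balance term only to samples from the synthesizable buffer are all hypothetical. The one standard ingredient is that, for an append-only sequence generator, each SMILES string has a single construction path, so the backward policy is deterministic and the trajectory-balance residual reduces to `log_z + log_pf - log_reward`.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO buffer of (smiles, log_reward) pairs (hypothetical helper)."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def add(self, smiles, log_reward):
        self.items.append((smiles, log_reward))

    def sample(self, k, rng):
        # Sample without replacement, capped at the current buffer size.
        k = min(k, len(self.items))
        return rng.sample(list(self.items), k)


def tb_loss(log_z, log_pf, log_reward):
    """Squared trajectory-balance residual for an append-only sequence (P_B = 1)."""
    return (log_z + log_pf - log_reward) ** 2


def contrastive_penalty(logp_pos, logp_neg, margin=1.0):
    """Hinge on the gap between mean log-likelihoods (assumed form, not from the paper)."""
    if not logp_pos or not logp_neg:
        return 0.0
    gap = sum(logp_pos) / len(logp_pos) - sum(logp_neg) / len(logp_neg)
    return max(0.0, margin - gap)


def soft_constrained_loss(log_z, log_prob, pos_batch, neg_batch, weight=0.5, margin=1.0):
    """Trajectory balance on synthesizable samples plus a weighted contrastive penalty."""
    logp_pos = [log_prob(s) for s, _ in pos_batch]
    logp_neg = [log_prob(s) for s, _ in neg_batch]
    tb_terms = [tb_loss(log_z, lp, lr) for lp, (_, lr) in zip(logp_pos, pos_batch)]
    tb = sum(tb_terms) / len(tb_terms) if tb_terms else 0.0
    return tb + weight * contrastive_penalty(logp_pos, logp_neg, margin)


if __name__ == "__main__":
    rng = random.Random(0)
    synthesizable = ReplayBuffer(capacity=100)
    unsynthesizable = ReplayBuffer(capacity=100)
    synthesizable.add("CCO", 0.0)
    unsynthesizable.add("C1CC1N", -1.0)

    # Toy stand-in for the policy's sequence log-likelihood.
    def toy_log_prob(smiles):
        return -0.5 * len(smiles)

    loss = soft_constrained_loss(
        log_z=1.5,
        log_prob=toy_log_prob,
        pos_batch=synthesizable.sample(8, rng),
        neg_batch=unsynthesizable.sample(8, rng),
    )
    print(loss)
```

In a real setting `log_prob` would be the summed token log-probabilities of the sequence model, and the synthesizability labels that route samples into the two buffers would come from an external oracle; neither is specified in the text above. The penalty is soft in the sense that it only shifts probability mass away from unsynthesizable samples and leaves the action space unrestricted.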

Problem

Research questions and friction points this paper is trying to address.

molecular generation
synthesizability
drug discovery
generative models
chemical synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

GFlowNets
soft constraints
synthesizable molecule generation
contrastive learning
SMILES