SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints

📅 2024-05-02
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Existing generative models for drug design frequently produce molecules that are chemically unsynthesizable. Method: We propose a GFlowNet-based generative framework grounded in forward-synthesis pathway modeling, which—uniquely—explicitly embeds chemical reaction templates and a purchasable reagent library into the action space to enable end-to-end learning of synthetic feasibility constraints. To mitigate reaction-encoding bias and support multi-constraint MDP formulation, we introduce an inverse-policy learning mechanism. Our approach integrates reaction graph encoding, SA Score-guided optimization, and independent retrosynthetic validation. Results: Experiments demonstrate substantial improvements over baselines: generated molecules exhibit significantly higher structural diversity, average SA Score decreases by 12.3%, retrosynthetic success rate increases by 18.7%, and the model reliably infers feasible synthesis pathways for novel molecules.

Technology Category

Application Category

📝 Abstract
Generative models see increasing use in computer-aided drug design. However, while performing well at capturing distributions of molecular motifs, they often produce synthetically inaccessible molecules. To address this, we introduce SynFlowNet, a GFlowNet model whose action space uses chemical reactions and purchasable reactants to sequentially build new molecules. By incorporating forward synthesis as an explicit constraint of the generative mechanism, we aim at bridging the gap between in silico molecular generation and real world synthesis capabilities. We evaluate our approach using synthetic accessibility scores and an independent retrosynthesis tool to assess the synthesizability of our compounds, and motivate the choice of GFlowNets through considerable improvement in sample diversity compared to baselines. Additionally, we identify challenges with reaction encodings that can complicate traversal of the MDP in the backward direction. To address this, we introduce various strategies for learning the GFlowNet backward policy and thus demonstrate how additional constraints can be integrated into the GFlowNet MDP framework. This approach enables our model to successfully identify synthesis pathways for previously unseen molecules.
Problem

Research questions and friction points this paper is trying to address.

Generates diverse molecules with synthesis constraints
Improves synthetic accessibility of designed molecules
Addresses challenges in reaction encodings for GFlowNets
Innovation

Methods, ideas, or system contributions that make the work stand out.

GFlowNet model with chemical reaction action space
Forward synthesis as explicit generative constraint
Learning backward policy for MDP traversal