Generative Flows on Synthetic Pathway for Drug Design

📅 2024-10-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Most existing generative models for drug discovery do not account for synthesizability, which limits their practical use. RxnFlow addresses this by training Generative Flow Networks (GFlowNets) over synthetic pathways: molecules are assembled step by step from predefined building blocks and reaction templates, so each generated molecule is tied to a proposed synthetic pathway. An action-space subsampling method keeps training tractable over a very large action space (1.2 million building blocks and 71 reaction templates), and the trained model can generate with a modified or expanded action space without retraining. RxnFlow outperforms existing reaction-based and fragment-based models in pocket-specific optimization, and on CrossDocked2020 pocket-conditional generation it reaches an average Vina score of −8.85 kcal/mol with 34.8% synthesizability.

📝 Abstract
Generative models in drug discovery have recently gained attention as efficient alternatives to brute-force virtual screening. However, most existing models do not account for synthesizability, limiting their practical use in real-world scenarios. In this paper, we propose RxnFlow, which sequentially assembles molecules using predefined molecular building blocks and chemical reaction templates to constrain the synthetic chemical pathway. We then train on this sequential generating process with the objective of generative flow networks (GFlowNets) to generate both highly rewarded and diverse molecules. To mitigate the large action space of synthetic pathways in GFlowNets, we implement a novel action space subsampling method. This enables RxnFlow to learn generative flows over extensive action spaces comprising combinations of 1.2 million building blocks and 71 reaction templates without significant computational overhead. Additionally, RxnFlow can employ modified or expanded action spaces for generation without retraining, allowing for the introduction of additional objectives or the incorporation of newly discovered building blocks. We experimentally demonstrate that RxnFlow outperforms existing reaction-based and fragment-based models in pocket-specific optimization across various target pockets. Furthermore, RxnFlow achieves state-of-the-art performance on CrossDocked2020 for pocket-conditional generation, with an average Vina score of -8.85 kcal/mol and 34.8% synthesizability.
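The sequential assembly described in the abstract can be pictured with a small toy sketch. Everything here is hypothetical and for illustration only: the `Pathway` class, the template names, and the building-block identifiers are not taken from the paper, and no chemistry is actually performed. The sketch only shows how a molecule can be recorded as a starting block plus a sequence of (reaction template, building block) steps, and how quickly the number of template and block combinations grows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pathway:
    """A toy synthetic pathway: a starting block plus (template, block) steps."""
    start_block: str
    steps: tuple = ()  # each step is a (template, block) pair of hypothetical names

    def extend(self, template, block):
        # Return a new pathway with one more reaction step appended.
        return Pathway(self.start_block, self.steps + ((template, block),))

    def describe(self):
        # Render the pathway as a readable one-line route.
        parts = [self.start_block]
        parts += [f"--[{t}]--> +{b}" for t, b in self.steps]
        return " ".join(parts)


def num_template_block_pairs(num_templates, num_blocks):
    # Raw count of (template, block) combinations; not every pair need be
    # chemically compatible, so this is only a rough size of the action space.
    return num_templates * num_blocks


route = (
    Pathway("amine_A")
    .extend("amide_coupling", "acid_B")
    .extend("suzuki_coupling", "boronate_C")
)
print(route.describe())
print(num_template_block_pairs(71, 1_200_000))
```

With the figures quoted in the abstract, 71 templates and 1.2 million building blocks give 85,200,000 raw (template, block) combinations per step, which is why the large action space needs special handling.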
Problem

Research questions and friction points this paper is trying to address.

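Most generative models for drug design do not account for synthesizability, which limits their practical use.
Synthetic pathways built from 1.2 million building blocks and 71 reaction templates create a very large action space for GFlowNet training.
Generated molecules need to be both highly rewarded for a specific target pocket and diverse.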
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential molecule assembly from predefined building blocks and reaction templates
GFlowNet training objective for generating highly rewarded and diverse molecules
Action space subsampling for efficient learning over large synthetic-pathway action spaces
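To give a feel for why subsampling an action space can work, here is a minimal sketch of a generic Monte Carlo estimator, not the paper's actual method. It assumes a policy that scores every action with a logit and needs the log-partition (the log of the sum of exponentiated logits) to normalize them. Instead of summing over all actions, it samples `k` actions uniformly and corrects for the sampling ratio. The function names and the random logits are illustrative assumptions.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def subsampled_log_z(logits, k, rng):
    """Estimate the log-partition over all actions from k uniformly sampled ones.

    Each sampled logit stands in for len(logits) / k actions, hence the
    log(n / k) correction. Generic estimator, assumed for illustration.
    """
    n = len(logits)
    sampled = rng.sample(logits, k)
    return log_sum_exp(sampled) + math.log(n / k)


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(100_000)]  # hypothetical action scores
exact = log_sum_exp(logits)
approx = subsampled_log_z(logits, k=1_000, rng=rng)
print(f"exact log Z = {exact:.3f}, estimate from 1% of actions = {approx:.3f}")
```

The estimate touches only `k` logits per step rather than the full action set, so the cost per step no longer scales with the total number of building blocks. How the paper weights and samples actions may differ from this sketch.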