Scalable and Cost-Efficient de Novo Template-Based Molecular Generation

📅 2025-06-10
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses three key challenges in template-based molecular generation with Generative Flow Networks (GFlowNets): high synthesis cost, difficulty in scaling to large building block libraries, and weak performance when only a small fragment set is available. The authors propose a recursive, cost-guided generative framework. Methodologically, they design a backward policy guided by auxiliary machine learning models that approximate synthesis cost and viability, introduce a dynamic building block library that reuses high-reward intermediate states, and apply an exploitation penalty to balance exploration and exploitation. The key contribution is to embed synthesis cost explicitly in the generative process, and the dynamic library mechanism improves performance when the building block set is small. The authors report higher-quality, more diverse molecules at lower synthesis cost and state-of-the-art results in template-based molecular generation.

📝 Abstract
Template-based molecular generation offers a promising avenue for drug design by ensuring generated compounds are synthetically accessible through predefined reaction templates and building blocks. In this work, we tackle three core challenges in template-based GFlowNets: (1) minimizing synthesis cost, (2) scaling to large building block libraries, and (3) effectively utilizing small fragment sets. We propose Recursive Cost Guidance, a backward policy framework that employs auxiliary machine learning models to approximate synthesis cost and viability. This guidance steers generation toward low-cost synthesis pathways, significantly enhancing cost-efficiency, molecular diversity, and quality, especially when paired with an Exploitation Penalty that balances the trade-off between exploration and exploitation. To enhance performance in smaller building block libraries, we develop a Dynamic Library mechanism that reuses intermediate high-reward states to construct full synthesis trees. Our approach establishes state-of-the-art results in template-based molecular generation.
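To make the idea of a cost-guided backward policy more concrete, here is a minimal sketch, not the paper's implementation. It assumes that an auxiliary model has already produced an estimated synthesis cost for each candidate parent state, and it turns those estimates into a backward distribution with a softmax over negative cost. The function name `backward_policy`, the `temperature` parameter, and the route labels are all hypothetical.

```python
import math


def backward_policy(parent_costs, temperature=1.0):
    """Map estimated parent costs to probabilities (illustrative assumption).

    parent_costs: dict from a parent-state label to its estimated synthesis cost.
    Cheaper parents receive more probability mass via a softmax over -cost.
    """
    if not parent_costs:
        raise ValueError("need at least one candidate parent")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Shift by the minimum cost so the largest exponent is exactly 0.
    min_cost = min(parent_costs.values())
    weights = {
        parent: math.exp(-(cost - min_cost) / temperature)
        for parent, cost in parent_costs.items()
    }
    total = sum(weights.values())
    return {parent: weight / total for parent, weight in weights.items()}


# Hypothetical cost estimates for two ways of decomposing the same molecule.
probs = backward_policy({"route_a": 1.0, "route_b": 3.0})
```

In this toy setting the cheaper `route_a` receives most of the probability mass, which is the qualitative behavior the abstract describes: guidance that steers generation toward low-cost synthesis pathways.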
Problem

Research questions and friction points this paper is trying to address.

Minimizing synthesis costs in template-based molecular generation
Scaling generation to large building block libraries efficiently
Enhancing performance with limited small fragment sets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recursive Cost Guidance optimizes synthesis pathways
Dynamic Library reuses intermediate states for efficiency
Exploitation Penalty balances exploration and exploitation trade-off
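The Dynamic Library idea listed above can be illustrated with a small sketch under stated assumptions. It assumes that intermediate states are identified by SMILES strings and scored by a scalar reward, and that the highest-reward intermediates are promoted into the building block library. The function `update_library`, the `top_k` parameter, and the example molecules and rewards are hypothetical and are not taken from the paper.

```python
def update_library(library, intermediates, top_k=2):
    """Return a new library extended with the top_k highest-reward intermediates.

    library: set of building block identifiers (SMILES strings here).
    intermediates: dict from intermediate identifier to its reward.
    The input library is left unchanged.
    """
    ranked = sorted(intermediates.items(), key=lambda item: item[1], reverse=True)
    # Keep only the best-scoring candidates that are not already building blocks.
    new_blocks = {smiles for smiles, _ in ranked[:top_k] if smiles not in library}
    return library | new_blocks


# Hypothetical starting library and rewards for intermediates seen during sampling.
library = {"CCO", "c1ccccc1"}
intermediates = {"CC(=O)O": 0.9, "CCN": 0.2, "CCO": 0.8}
updated = update_library(library, intermediates, top_k=2)
```

Returning a new set rather than mutating the input keeps the original library available for comparison, which is convenient when experimenting with different values of `top_k`.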
Piotr Gainski
Jagiellonian University, Faculty of Mathematics and Computer Science
Oussama Boussif
Mila – Québec AI Institute, Université de Montréal
Andrei Rekesh
University of Toronto
Dmytro Shevchuk
University of Toronto, The Hospital for Sick Children Research Institute
Alipanah Parviz
Mila – Québec AI Institute, Université de Montréal
Mike Tyers
University of Toronto, The Hospital for Sick Children Research Institute
Robert A. Batey
University of Toronto, Acceleration Consortium
Michał Koziarski
The Hospital for Sick Children, University of Toronto, Vector Institute