DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis

📅 2024-05-22
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Traditional retrosynthetic planning relies on iterative single-step decomposition, which makes the search space grow exponentially with route length and limits efficiency and scalability. To address this, we propose an end-to-end paradigm that formulates multi-step retrosynthesis as a conditional sequence generation task, allowing conditions such as a prescribed number of steps and specified starting materials. Methodologically, we use a Transformer-based model that generates the entire route as a single string, predicting each molecule conditioned on all preceding ones. On the PaRoutes benchmark, our approach improves Top-1 accuracy over state-of-the-art methods by 2.2× on the n$_1$ test set and by 3.3× on the n$_5$ test set. It also predicts routes for FDA-approved drugs that are absent from the training data, which indicates that the model generalizes beyond its training set.
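The Top-1 figures quoted above can be read as a Top-k accuracy over ranked route predictions. The sketch below shows one common reading of that metric, assuming each route has already been reduced to a canonical string so that exact string equality means "same route". The function name `top_k_accuracy` and the placeholder route labels are hypothetical and are not taken from the paper or from the PaRoutes tooling.

```python
def top_k_accuracy(predictions, references, k=1):
    """Fraction of targets whose reference route appears among the first k predictions.

    predictions: list of ranked candidate lists, one list per target (best first).
    references:  list of reference routes, one per target.
    Assumption: routes are canonical strings, so `in` tests route identity.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(1 for ranked, ref in zip(predictions, references) if ref in ranked[:k])
    return hits / len(references)


# Placeholder labels stand in for serialized routes.
predicted = [
    ["route_A", "route_B"],  # reference found at rank 1
    ["route_C", "route_D"],  # reference found at rank 2
    ["route_E"],             # reference not found
]
reference = ["route_A", "route_D", "route_X"]

top1 = top_k_accuracy(predicted, reference, k=1)  # 1 hit out of 3 targets
top2 = top_k_accuracy(predicted, reference, k=2)  # 2 hits out of 3 targets
```

Under this reading, a "2.2× improvement in Top-1 accuracy" is the ratio of two such fractions computed on the same test set.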

📝 Abstract
Traditional computer-aided synthesis planning (CASP) methods rely on iterative single-step predictions, leading to exponential search space growth that limits efficiency and scalability. We introduce a transformer-based model that directly generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones. The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset with a 2.2x improvement in Top-1 accuracy on the n$_1$ test set and a 3.3x improvement on the n$_5$ test set. It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities. While the current suboptimal diversity of the training set may impact performance on less common reaction types, our approach presents a promising direction towards fully automated retrosynthetic planning.
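To make "a multi-step route as a single string" concrete, here is a minimal sketch that assumes a route is a tree of molecules stored as nested dictionaries and serialized depth-first with JSON. The field names (`smiles`, `children`), the helper names, and the shape of the conditioning record are illustrative assumptions, not the paper's actual format. The example route (aspirin from salicylic acid and acetic anhydride, with salicylic acid from phenol and CO2) is a textbook illustration, not data from the paper.

```python
import json


def serialize_route(node):
    """Serialize a route tree into one compact string (depth-first, target first)."""
    return json.dumps(node, separators=(",", ":"))


def count_steps(node):
    """Number of reaction steps along the longest branch of the route tree."""
    children = node.get("children", [])
    if not children:
        return 0
    return 1 + max(count_steps(child) for child in children)


def leaves(node):
    """Starting materials: molecules with no further decomposition."""
    children = node.get("children", [])
    if not children:
        return [node["smiles"]]
    found = []
    for child in children:
        found.extend(leaves(child))
    return found


def build_example(route, starting_material=None):
    """Pair the conditioning inputs with the string the model would be asked to emit."""
    return {
        "target": route["smiles"],
        "n_steps": count_steps(route),
        "starting_material": starting_material,  # optional condition
        "output": serialize_route(route),
    }


# Illustrative two-step route.
route = {
    "smiles": "CC(=O)Oc1ccccc1C(=O)O",  # aspirin (target)
    "children": [
        {
            "smiles": "O=C(O)c1ccccc1O",  # salicylic acid (intermediate)
            "children": [
                {"smiles": "Oc1ccccc1"},  # phenol
                {"smiles": "O=C=O"},      # carbon dioxide
            ],
        },
        {"smiles": "CC(=O)OC(C)=O"},  # acetic anhydride
    ],
}

example = build_example(route, starting_material="Oc1ccccc1")
```

Because the target comes first in the serialized string and each precursor follows its product, an autoregressive decoder that emits this string token by token predicts every molecule conditioned on all preceding ones, which is the property the abstract describes.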

Problem

Research questions and friction points this paper is trying to address.

Complex Chemical Synthesis
Efficiency
Accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

DMS-Flex (Duo) Model
Multi-step Synthesis Planning
Expert Mixture Approach

Authors

Yu Shee
Yale University
Haote Li
Yale University
Anton Morgunov
Yale University
Victor Batista
John Gamble Kirkwood Professor of Chemistry, Yale University
Solar Energy, Photosynthesis, GPCRs, Catalysis, Allostery