🤖 AI Summary
This work addresses the limitations of existing data-driven approaches to evaluating multi-step organic synthesis routes, which often oversimplify the multi-objective nature of route design and rely on proxy data such as patent routes, and therefore fail to balance feasibility, cost, and efficiency. The authors propose an interpretable scoring framework that explicitly integrates chemists' domain knowledge. A DeepSets-based model is first trained on the tree edit distance between reference and machine-generated routes, then fine-tuned with expert evaluations so that it handles both regression (quantitative scores) and multi-class classification (labeling routes as Good, Plausible, or Bad). The method reports a Spearman correlation of 0.78, a Pearson correlation of 0.77, and a top-1 ranking accuracy of 60.2%, compared with 17.5% for the previous baseline.
📝 Abstract
Selecting efficient multi-step synthetic routes is a central challenge in organic synthesis, particularly in medicinal and process chemistry, where route choice directly impacts feasibility, cost, and development efficiency. Data-driven assessment systems often oversimplify the multi-objective nature of synthesis design and rely on proxy datasets, such as patent routes, rather than universally grounded criteria. To address this, we introduce an expert-augmented, data-driven scoring framework that integrates machine learning with chemists' domain knowledge for both numerical and explainable route assessment. A DeepSets-based model is trained using tree edit distance between reference and machine-generated routes, and then fine-tuned with expert evaluations to produce both quantitative scores and interpretable qualitative categories: Good, Plausible, and Bad. The resulting system achieves a Spearman correlation coefficient of 0.78 and a Pearson correlation of 0.77 for category assessment prediction, and 60.2% top-1 ranking accuracy for score prediction, substantially outperforming the previous baseline of 17.5%.
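To make the reported metrics concrete, the following is a minimal sketch of how Pearson correlation, Spearman correlation, and a top-1 ranking accuracy could be computed for predicted versus expert route scores. It is not the authors' evaluation code. The function names, the toy data, and in particular the definition of top-1 accuracy used here (the fraction of targets for which the route with the highest predicted score is also the route the expert scored highest) are assumptions made for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def top1_accuracy(groups):
    """Assumed definition: share of targets whose best-predicted route
    is also the expert's best-scored route.

    groups: list of (predicted_scores, expert_scores), one pair per target.
    """
    hits = 0
    for predicted, expert in groups:
        best_pred = max(range(len(predicted)), key=predicted.__getitem__)
        best_expert = max(range(len(expert)), key=expert.__getitem__)
        hits += best_pred == best_expert
    return hits / len(groups)


# Hypothetical toy data: scores for four routes to one target.
predicted = [0.9, 0.4, 0.7, 0.1]
expert = [1.0, 0.5, 0.6, 0.2]
print(pearson(predicted, expert), spearman(predicted, expert))

# Hypothetical toy data: two targets, two candidate routes each.
groups = [([0.9, 0.4], [1.0, 0.5]), ([0.2, 0.8], [0.7, 0.3])]
print(top1_accuracy(groups))
```

Spearman correlation depends only on the ordering of the scores, so it rewards a model that ranks routes correctly even when its numeric scores are miscalibrated, whereas Pearson correlation also reflects how linearly the predicted scores track the expert scores.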