Margin-calibrated Classifier Guidance for Property-driven Synthesis Planning

📅 2026-05-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
Classifiers trained with a standard cross-entropy loss struggle to steer autoregressive retrosynthesis models toward reactions that satisfy specific chemical properties or chemist preferences, because they cannot override the token-level distributions the generator learned from sparse, single-disconnection reaction data. This work proposes Sequence Completion Ranking (SCR), a novel method that combines contrastive learning with a margin-based loss to calibrate the guidance classifier so that it can discriminate between candidate continuations during decoding. The authors show formally that margin-calibrated classifiers can expand the set of property-satisfying sequences reachable under guided beam search, and report that the method closes the diversity gap between template-free and template-based approaches. On the USPTO-190 benchmark, multi-step solve rates rise from 16.8% (unguided generator) to 78.4% with reaction-type guidance and 95.3% with Tanimoto guidance, and valid routes are found for 33 targets (17.4%) that the baselines could not solve.
📝 Abstract
Synthesis planning seeks an efficient sequence of chemical reactions that produces a target molecule. Typically, a pretrained single-step (autoregressive) retrosynthesis model is invoked repeatedly to generate such a sequence. Classifier guidance can, in principle, steer the output of the single-step model toward reactions that satisfy specific constraints or accommodate a chemist's preferences during inference, without retraining the autoregressive generator. We show that auxiliary classifiers trained with cross-entropy loss are insufficient to override the unconditional token-level distributions learned from typical sparse single-disconnection reaction datasets. We overcome this issue with a novel method called Sequence Completion Ranking (SCR), which employs contrastive augmentation and a margin-based loss to calibrate the classifier so that it can meaningfully discriminate between continuations during decoding. We formally establish that margin-calibrated classifiers can expand the set of property-satisfying sequences reachable under guided beam search. Empirically, on USPTO-190, given chemist-specified guidance targets, SCR substantially improves multi-step solve rates from $16.8\%$ (unguided generator) to $78.4\%$ with reaction-type guidance and $95.3\%$ with Tanimoto guidance, unlocking valid routes for 33 targets ($17.4\%$) previously unsolvable with baselines. Our method also effectively closes the long-standing diversity gap between template-free and template-based methods.
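The interplay the abstract describes, between a confident generator and a guidance classifier that may or may not separate continuations, can be illustrated with a toy sketch. This is not the paper's SCR implementation: it assumes a generic log-linear guidance rule (generator log-probability plus a weighted classifier log-probability) and a generic pairwise hinge loss as a stand-in for "a margin-based loss". The function names, the three-token vocabulary, and all probabilities are hypothetical and chosen only for illustration.

```python
import math

def guided_scores(gen_logprobs, clf_logprobs, weight=1.0):
    """Add weighted classifier log-probs to generator log-probs, per candidate token.

    Assumed guidance rule (not taken from the paper): score = log p_gen + weight * log p_clf.
    """
    return {tok: gen_logprobs[tok] + weight * clf_logprobs[tok] for tok in gen_logprobs}

def top_k(scores, k):
    """Return the k highest-scoring candidates, as a beam-search step would keep them."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Generic pairwise hinge loss: zero once the positive outscores the negative by `margin`."""
    return max(0.0, margin - (pos_score - neg_score))

# Toy generator: strongly prefers token "A". Assume only "C" leads to a
# property-satisfying completion (hypothetical setup).
gen = {"A": math.log(0.90), "B": math.log(0.09), "C": math.log(0.01)}

# A weakly separated classifier: its preference for "C" is too small to matter.
flat_clf = {"A": math.log(0.45), "B": math.log(0.45), "C": math.log(0.55)}

# A well-separated classifier, the kind of gap a margin objective pushes toward.
sharp_clf = {"A": math.log(0.001), "B": math.log(0.001), "C": math.log(0.999)}

beam_flat = top_k(guided_scores(gen, flat_clf), k=1)    # generator's choice survives
beam_sharp = top_k(guided_scores(gen, sharp_clf), k=1)  # guidance overrides the generator

# The hinge loss keeps penalizing a pair until the score gap reaches the margin.
loss_small_gap = margin_ranking_loss(0.6, 0.5)  # gap 0.1 < margin, loss is about 0.9
loss_large_gap = margin_ranking_loss(2.0, 0.5)  # gap 1.5 >= margin, loss is 0.0
```

In this toy setting the weakly separated classifier leaves the beam on "A", while the well-separated one moves it to "C", which is the kind of change in reachable sequences the abstract attributes to margin calibration. How SCR constructs its contrastive pairs and scores sequence completions is not specified in this summary and is not modeled here.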
Problem

Research questions and friction points this paper is trying to address.

synthesis planning
classifier guidance
property-driven
retrosynthesis
sequence generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Classifier Guidance
Margin Calibration
Sequence Completion Ranking
Retrosynthesis Planning
Contrastive Learning