🤖 AI Summary
This work proposes a novel approach to reaction prediction by modeling chemical transformations as continuous trajectories in a latent space anchored to thermodynamic products, circumventing the limitations of existing methods that rely on one-step mappings or require mechanistic annotations for discrete stepwise generation. By leveraging conditional flow matching, the model learns time-dependent latent dynamics directly from standard reactant–product pairs without intermediate state labels. This formulation enables trajectory-level diagnostics, localization and characterization of failure modes, and mitigation of certain errors via gated inference, while intrinsic uncertainty estimates are derived from the geometric properties of the learned trajectories. Evaluated on USPTO benchmarks, the method achieves state-of-the-art performance and simultaneously provides interpretable reaction pathways, partial error mitigation, and an automated mechanism for prioritizing reliable predictions and flagging ambiguous ones.
📝 Abstract
Recent advances in reaction prediction have achieved near-saturated accuracy on standard benchmarks (e.g., USPTO), yet most state-of-the-art models formulate the task as a one-shot mapping from reactants to products, offering limited insight into the underlying reaction process. Procedural alternatives introduce stepwise generation but often rely on mechanism-specific supervision, discrete symbolic edits, and computationally expensive inference. In this work, we propose LatentRxnFlow, a new reaction prediction paradigm that models reactions as continuous latent trajectories anchored at the thermodynamic product state. Built on Conditional Flow Matching, our approach learns time-dependent latent dynamics directly from standard reactant-product pairs, without requiring mechanistic annotations or curated intermediate labels. While LatentRxnFlow achieves state-of-the-art performance on USPTO benchmarks, more importantly, the continuous formulation exposes the full generative trajectory, enabling trajectory-level diagnostics that are difficult to realize with discrete or one-shot models. We show that latent trajectory analysis allows us to localize and characterize failure modes and to mitigate certain errors via gated inference. Furthermore, geometric properties of the learned trajectories provide an intrinsic signal of epistemic uncertainty, helping prioritize reliably predictable reaction outcomes and flag ambiguous cases for additional validation. Overall, LatentRxnFlow combines strong predictive accuracy with improved transparency, diagnosability, and uncertainty awareness, moving reaction prediction toward more trustworthy deployment in high-throughput discovery workflows.
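To make the training signal concrete, the following is a minimal sketch of a conditional flow matching objective on reactant-product latent pairs. It assumes the common straight-line interpolant between the two latents; the abstract does not specify LatentRxnFlow's actual path, encoder, conditioning, or network, so the function names (`cfm_training_pair`, `cfm_loss`, `integrate`) and the `velocity_fn(z, t, condition)` signature are hypothetical.

```python
import numpy as np

def cfm_training_pair(z_reactant, z_product, t):
    """Return a point on the assumed straight-line path and its target velocity.

    z_reactant, z_product: arrays of shape (batch, dim)
    t: array of shape (batch, 1) with values in [0, 1]
    """
    z_t = (1.0 - t) * z_reactant + t * z_product  # position at time t
    u_t = z_product - z_reactant                  # constant velocity of a linear path
    return z_t, u_t

def cfm_loss(velocity_fn, z_reactant, z_product, t):
    """Mean squared error between predicted and target velocities."""
    z_t, u_t = cfm_training_pair(z_reactant, z_product, t)
    pred = velocity_fn(z_t, t, z_reactant)  # conditioning on the reactant is an assumption
    return float(np.mean((pred - u_t) ** 2))

def integrate(velocity_fn, z_reactant, n_steps=10):
    """Euler-integrate from t=0 to t=1 and return every intermediate latent state."""
    z = z_reactant.copy()
    trajectory = [z.copy()]
    dt = 1.0 / n_steps
    for k in range(n_steps):
        z = z + dt * velocity_fn(z, k * dt, z_reactant)
        trajectory.append(z.copy())
    return np.stack(trajectory)  # shape (n_steps + 1, batch, dim)
```

Only reactant and product latents enter the loss, which is why no intermediate labels are needed. Because `integrate` returns the whole path rather than just its endpoint, the intermediate states are available for the kind of trajectory-level inspection the abstract describes.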