Can Molecular Foundation Models Know What They Don't Know? A Simple Remedy with Preference Optimization

📅 2025-09-29

🤖 AI Summary
Molecular foundation models often make high-confidence but wrong predictions (chemical hallucinations) on out-of-distribution (OOD) molecules, which undermines their reliability in high-stakes applications such as drug discovery. To address this, we propose Mole-PAIR (Molecular Preference-Aligned Instance Ranking), which formulates OOD detection as a preference optimization problem: a pairwise objective ranks in-distribution (ID) and OOD samples by their estimated OOD affinity, and this objective essentially optimizes AUROC. The module is plug-and-play and is applied to existing molecular foundation models through low-cost post-training. Evaluated on five real-world molecular datasets, Mole-PAIR improves OOD detection AUROC by up to 45.8% under size shift, 43.9% under scaffold shift, and 24.3% under assay shift.

📝 Abstract
Molecular foundation models are rapidly advancing scientific discovery, but their unreliability on out-of-distribution (OOD) samples severely limits their application in high-stakes domains such as drug discovery and protein design. A critical failure mode is chemical hallucination, where models make high-confidence yet entirely incorrect predictions for unknown molecules. To address this challenge, we introduce Molecular Preference-Aligned Instance Ranking (Mole-PAIR), a simple, plug-and-play module that can be flexibly integrated with existing foundation models to improve their reliability on OOD data through cost-effective post-training. Specifically, our method formulates the OOD detection problem as a preference optimization over the estimated OOD affinity between in-distribution (ID) and OOD samples, achieving this goal through a pairwise learning objective. We show that this objective essentially optimizes AUROC, which measures how consistently ID and OOD samples are ranked by the model. Extensive experiments across five real-world molecular datasets demonstrate that our approach significantly improves the OOD detection capabilities of existing molecular foundation models, achieving up to 45.8%, 43.9%, and 24.3% improvements in AUROC under distribution shifts of size, scaffold, and assay, respectively.
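
The link between a pairwise objective and AUROC can be made concrete with a standard argument. The block below is a sketch under stated assumptions, not the paper's own derivation: $s(x)$ denotes a hypothetical OOD affinity score where higher means "more OOD", and the logistic surrogate is one common choice that may differ from the exact loss used in Mole-PAIR.

```latex
% Assumption: s(x) is an OOD affinity score, higher = more OOD.
% AUROC is the probability that a random OOD sample outranks a random ID sample (ties ignored):
\mathrm{AUROC}(s) \;=\; \Pr_{x_{\mathrm{ood}} \sim P_{\mathrm{ood}},\; x_{\mathrm{id}} \sim P_{\mathrm{id}}}
  \bigl[\, s(x_{\mathrm{ood}}) > s(x_{\mathrm{id}}) \,\bigr]

% Writing the margin as m = s(x_ood) - s(x_id), the 0-1 ranking error is bounded by a logistic loss:
1 - \mathrm{AUROC}(s) \;\le\; \mathbb{E}\bigl[\mathbb{1}[m \le 0]\bigr]
  \;\le\; \frac{1}{\log 2}\, \mathbb{E}\bigl[\log\bigl(1 + e^{-m}\bigr)\bigr]
```

Under these assumptions, driving the pairwise logistic loss down pushes the AUROC up, which is the sense in which a pairwise objective can be said to optimize AUROC.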

Problem

Research questions and friction points this paper is trying to address.

Addressing unreliable predictions of molecular foundation models on out-of-distribution samples
Reducing chemical hallucination where models make incorrect high-confidence predictions
Improving OOD detection through preference optimization and pairwise learning objectives

Innovation

Methods, ideas, or system contributions that make the work stand out.

Plug-and-play module for molecular foundation models
Preference optimization for OOD detection
Pairwise learning objective optimizing AUROC
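
To illustrate the pairwise idea listed above, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the `beta` scale, and the toy scores are all hypothetical, and the logistic pairwise loss is a generic ranking surrogate that may differ from the objective used in Mole-PAIR. Scores are assumed to be OOD affinities where higher means "more OOD".

```python
import math
from itertools import product


def pairwise_auroc(id_scores, ood_scores):
    """Empirical AUROC: fraction of (ID, OOD) pairs in which the OOD
    sample receives the higher score. Ties count as half a correct pair."""
    correct = 0.0
    for s_id, s_ood in product(id_scores, ood_scores):
        if s_ood > s_id:
            correct += 1.0
        elif s_ood == s_id:
            correct += 0.5
    return correct / (len(id_scores) * len(ood_scores))


def pairwise_logistic_loss(id_scores, ood_scores, beta=1.0):
    """Mean of log(1 + exp(-beta * (s_ood - s_id))) over all pairs.
    `beta` is a hypothetical scale parameter, not taken from the paper."""
    total = 0.0
    count = 0
    for s_id, s_ood in product(id_scores, ood_scores):
        margin = beta * (s_ood - s_id)
        # Numerically stable softplus(-margin).
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
        count += 1
    return total / count


# Toy scores (made up for illustration only).
id_scores = [0.1, 0.4, 0.35]
ood_scores = [0.8, 0.3, 0.9]

auroc = pairwise_auroc(id_scores, ood_scores)        # 7 of 9 pairs ranked correctly
loss = pairwise_logistic_loss(id_scores, ood_scores)
print(f"AUROC = {auroc:.3f}, pairwise loss = {loss:.3f}")
```

In a post-training setting, the loss would be minimized with respect to the parameters that produce the scores, so that OOD molecules are pushed above ID molecules in the ranking; the explicit double loop here is only for readability and would normally be vectorized.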