Similarity-Quantized Relative Difference Learning for Improved Molecular Activity Prediction

📅 2025-01-15
🤖 AI Summary
In molecular activity prediction for drug discovery, conventional models lose accuracy and generalize poorly when training data are scarce and noisy. To address this, we propose the Similarity-Quantized Relative Learning (SQRL) paradigm: it reformulates absolute activity regression as modeling relative differences between structurally similar molecular pairs, using precomputed molecular similarities (e.g., Tanimoto similarity over ECFP fingerprints) to guide graph neural network (GNN) training. By moving beyond modeling each molecule independently, SQRL improves model robustness and data efficiency. Evaluated on multiple public datasets and proprietary industrial datasets, SQRL achieves an average 18.7% reduction in mean absolute error (MAE) and improves AUC by over 0.12 in small-sample tasks (<500 compounds). These results indicate that SQRL is accurate, generalizes well, and is practical for real-world drug discovery.
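The summary refers to precomputed ECFP/Tanimoto similarities. As a minimal sketch of that similarity measure, assume each fingerprint is represented as a set of on-bit indices. The function name `tanimoto` and the bit sets below are made up for illustration; they are not real ECFP output and not taken from the paper.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    # |A ∩ B| / |A ∪ B|, with the union size computed from the set sizes
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical on-bit sets for three molecules (illustrative only)
fps = [{1, 4, 7, 9}, {1, 4, 7, 12}, {2, 3, 20}]

# Precompute the full pairwise similarity matrix once, before training
sim = [[tanimoto(a, b) for b in fps] for a in fps]
```

In practice the fingerprints would likely come from a cheminformatics toolkit; the set-based form here only keeps the example dependency-free.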

📝 Abstract
Accurate prediction of molecular activities is crucial for efficient drug discovery, yet remains challenging due to limited and noisy datasets. We introduce Similarity-Quantized Relative Learning (SQRL), a learning framework that reformulates molecular activity prediction as relative difference learning between structurally similar pairs of compounds. SQRL uses precomputed molecular similarities to enhance training of graph neural networks and other architectures, and significantly improves accuracy and generalization in low-data regimes common in drug discovery. We demonstrate its broad applicability and real-world potential through benchmarking on public datasets as well as proprietary industry data. Our findings demonstrate that leveraging similarity-aware relative differences provides an effective paradigm for molecular activity prediction.
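A rough sketch of the pairing idea described in the abstract, assuming a precomputed similarity matrix and a similarity cutoff. The function names, the `min_sim` parameter and its value, the toy data, and the reference-based inference rule are all illustrative assumptions, not details confirmed by the paper.

```python
def build_relative_pairs(sim, activities, min_sim=0.5):
    """Return (i, j, activities[i] - activities[j]) for every ordered pair
    i != j whose precomputed similarity is at least min_sim."""
    pairs = []
    n = len(activities)
    for i in range(n):
        for j in range(n):
            if i != j and sim[i][j] >= min_sim:
                pairs.append((i, j, activities[i] - activities[j]))
    return pairs


def predict_from_reference(ref_activity, predicted_delta):
    """Recover an absolute activity from a reference compound's known
    activity plus a predicted relative difference (assumed inference rule)."""
    return ref_activity + predicted_delta


# Toy data: made-up similarities and activities for three compounds
sim = [
    [1.0, 0.6, 0.1],
    [0.6, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
activities = [6.5, 7.0, 4.0]

# Only compounds 0 and 1 are similar enough to form training pairs
pairs = build_relative_pairs(sim, activities, min_sim=0.5)
```

Under these assumptions, a model would be trained to regress the third element of each tuple from the two molecules, and an absolute prediction for a new compound could be obtained by adding the predicted difference to the known activity of a similar training compound.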

Problem

Research questions and friction points this paper is trying to address.

Molecular Activity Prediction
Data Limitations
Accuracy Improvement

Innovation

Methods, ideas, or system contributions that make the work stand out.

SQRL
Graph Neural Network
Molecular Activity Prediction
Karina Zadorozhny
Prescient Design, Genentech
Kangway V Chuang
Prescient Design, Genentech
Bharath Sathappan
Prescient Design, Genentech
Ewan Wallace
Prescient Design, Genentech
Vishnu Sresht
Genentech
Colin A. Grambow
Prescient Design, Genentech