Conformal Prediction for Uncertainty Estimation in Drug-Target Interaction Prediction

📅 2025-05-24

🤖 AI Summary
Conventional marginal conformal prediction (CP) for drug-target interaction (DTI) prediction overlooks heterogeneity across drug and protein subgroups, which can leave uncertainty estimates miscalibrated for individual subgroups. Method: The paper analyzes three cluster-conditioned CP methods and compares them with marginal and group-conditioned CP. The three clusterings are built from nonconformity scores, feature similarity, and nearest neighbors, respectively. Contribution/Results: On the KIBA dataset, across four data-splitting strategies, nonconformity-based clustering yields the tightest prediction intervals and the most reliable subgroup coverage, especially in random and fully unseen drug-protein splits. Group-conditioned CP works well when one entity is familiar, while residual-driven clustering keeps uncertainty estimates robust even in sparse or novel scenarios. The results point to the potential of cluster-based CP for improving DTI prediction under uncertainty.

📝 Abstract
Accurate drug-target interaction (DTI) prediction with machine learning models is essential for drug discovery. Such models should also provide a credible representation of their uncertainty, but applying classical marginal conformal prediction (CP) in DTI prediction often overlooks variability across drug and protein subgroups. In this work, we analyze three cluster-conditioned CP methods for DTI prediction, and compare them with marginal and group-conditioned CP. Clusterings are obtained via nonconformity scores, feature similarity, and nearest neighbors, respectively. Experiments on the KIBA dataset using four data-splitting strategies show that nonconformity-based clustering yields the tightest intervals and most reliable subgroup coverage, especially in random and fully unseen drug-protein splits. Group-conditioned CP works well when one entity is familiar, but residual-driven clustering provides robust uncertainty estimates even in sparse or novel scenarios. These results highlight the potential of cluster-based CP for improving DTI prediction under uncertainty.
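
For readers unfamiliar with the marginal baseline mentioned in the abstract, the following is a minimal sketch of split conformal prediction for a regression target such as a binding-affinity score. It assumes absolute residuals as the nonconformity score, which the abstract does not specify; the function names `conformal_quantile` and `marginal_intervals` are illustrative and not taken from the paper.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Rank of the order statistic used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return sorted(scores)[k - 1]

def marginal_intervals(y_cal, pred_cal, pred_test, alpha=0.1):
    """One shared interval half-width for every test pair (marginal CP)."""
    # Assumption: nonconformity score = absolute residual |y - prediction|.
    scores = [abs(y - p) for y, p in zip(y_cal, pred_cal)]
    q = conformal_quantile(scores, alpha)
    return [(p - q, p + q) for p in pred_test]

# Toy usage with made-up numbers (not KIBA data).
intervals = marginal_intervals(
    y_cal=[1.0, 2.0, 3.0], pred_cal=[0.0, 0.0, 0.0], pred_test=[10.0], alpha=0.5
)
```

Because a single quantile is shared by all drug-protein pairs, the coverage guarantee holds only on average, which is the limitation that the subgroup-aware variants aim to address.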
Problem

Research questions and friction points this paper is trying to address.

Estimating uncertainty in drug-target interaction prediction
Addressing variability across drug and protein subgroups
Improving reliability of subgroup coverage in DTI prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cluster-conditioned conformal prediction methods
Nonconformity-based clustering for tighter intervals
Residual-driven clustering for robust uncertainty
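
The idea of clustering on nonconformity scores and then calibrating per cluster can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: each group (for example a drug ID) is summarized by its median calibration score, groups are ranked and cut into equal-sized bins as a simple stand-in for a real clustering algorithm, and all function names are hypothetical. In practice the clustering step is usually fitted on data held out from the calibration set; the toy example reuses the same scores for brevity.

```python
import math
from collections import defaultdict
from statistics import median

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else sorted(scores)[k - 1]

def cluster_groups_by_score(group_ids, scores, n_clusters):
    """Map each group to a cluster index based on its median score."""
    per_group = defaultdict(list)
    for g, s in zip(group_ids, scores):
        per_group[g].append(s)
    # Assumption: rank groups by median score and cut the ranking into
    # contiguous, roughly equal-sized bins (a stand-in for e.g. k-means).
    ranked = sorted(per_group, key=lambda g: median(per_group[g]))
    return {g: i * n_clusters // len(ranked) for i, g in enumerate(ranked)}

def cluster_quantiles(group_ids, scores, group_to_cluster, alpha):
    """Compute one conformal quantile per cluster instead of one overall."""
    per_cluster = defaultdict(list)
    for g, s in zip(group_ids, scores):
        per_cluster[group_to_cluster[g]].append(s)
    return {c: conformal_quantile(v, alpha) for c, v in per_cluster.items()}

# Toy usage with made-up scores: groups "a" and "b" are easy to predict,
# while "c" and "d" are hard.
ids = ["a"] * 3 + ["b"] * 3 + ["c"] * 3 + ["d"] * 3
scs = [0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
clusters = cluster_groups_by_score(ids, scs, n_clusters=2)
quantiles = cluster_quantiles(ids, scs, clusters, alpha=0.5)
```

A test pair is then assigned the quantile of its group's cluster, so easy-to-predict groups receive narrower intervals than hard ones, whereas a single marginal quantile would give every pair the same width.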