Conformal Prediction for Uncertainty Estimation in Drug-Target Interaction Prediction

📅 2025-05-24
🤖 AI Summary
Conventional marginal conformal prediction (CP) for drug–target interaction (DTI) prediction overlooks heterogeneity among drug and protein subgroups, so coverage can be unreliable within individual subgroups. Method: The paper analyzes three cluster-conditioned CP methods and compares them with marginal and group-conditioned CP; the clusterings are built from nonconformity scores, feature similarity, and nearest neighbors, respectively. Contribution/Results: On the KIBA dataset, across four data-splitting strategies, nonconformity-based clustering yields the tightest prediction intervals and the most reliable subgroup coverage, especially in random and fully unseen drug–protein splits. Group-conditioned CP works well when one entity is familiar, while residual-driven clustering remains robust in sparse or novel scenarios. The results point to the potential of cluster-based CP for trustworthy DTI prediction.

📝 Abstract
Accurate drug-target interaction (DTI) prediction with machine learning models is essential for drug discovery. Such models should also provide a credible representation of their uncertainty, but applying classical marginal conformal prediction (CP) in DTI prediction often overlooks variability across drug and protein subgroups. In this work, we analyze three cluster-conditioned CP methods for DTI prediction, and compare them with marginal and group-conditioned CP. Clusterings are obtained via nonconformity scores, feature similarity, and nearest neighbors, respectively. Experiments on the KIBA dataset using four data-splitting strategies show that nonconformity-based clustering yields the tightest intervals and most reliable subgroup coverage, especially in random and fully unseen drug-protein splits. Group-conditioned CP works well when one entity is familiar, but residual-driven clustering provides robust uncertainty estimates even in sparse or novel scenarios. These results highlight the potential of cluster-based CP for improving DTI prediction under uncertainty.
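
The difference between marginal and cluster-conditioned CP described above can be illustrated with a minimal split-conformal sketch. It assumes a regression setting with absolute residuals as the nonconformity score and cluster labels that are already given; the function names (`conformal_quantile`, `calibrate`, `predict_interval`) and the fallback to the marginal threshold for unseen clusters are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import defaultdict


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Split CP uses the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points: only an unbounded interval is valid.
        return math.inf
    return sorted(scores)[rank - 1]


def calibrate(y_true, y_pred, clusters, alpha):
    """Return the marginal threshold and one threshold per cluster."""
    # Assumption: absolute residual as the nonconformity score.
    scores = [abs(y - p) for y, p in zip(y_true, y_pred)]
    by_cluster = defaultdict(list)
    for score, cluster in zip(scores, clusters):
        by_cluster[cluster].append(score)
    marginal = conformal_quantile(scores, alpha)
    per_cluster = {c: conformal_quantile(s, alpha) for c, s in by_cluster.items()}
    return marginal, per_cluster


def predict_interval(y_hat, cluster, marginal, per_cluster):
    """Symmetric interval around y_hat using the cluster's own threshold."""
    # Assumption: clusters unseen during calibration use the marginal threshold.
    q = per_cluster.get(cluster, marginal)
    return (y_hat - q, y_hat + q)
```

With one low-error and one high-error cluster, the marginal threshold is dominated by the high-error cluster, while the per-cluster thresholds give the low-error cluster a much narrower interval. This is the mechanism by which conditioning on clusters can tighten intervals without giving up subgroup coverage.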

Problem

Research questions and friction points this paper is trying to address.

Estimating uncertainty in drug-target interaction prediction
Addressing variability across drug and protein subgroups
Improving reliability of subgroup coverage in DTI prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cluster-conditioned conformal prediction methods
Nonconformity-based clustering for tighter intervals
Residual-driven clustering for robust uncertainty
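
One plausible way to obtain clusters from nonconformity scores is sketched below: each group (for example, each drug) is summarized by a few empirical quantiles of its calibration scores, and the groups are then clustered on those summaries. This is an assumption about how such a clustering could work, not the paper's stated procedure; the quantile levels, the small deterministic k-means, and all names (`quantile_embedding`, `kmeans`, `cluster_groups`) are hypothetical.

```python
import math


def quantile_embedding(scores, probs=(0.5, 0.7, 0.9)):
    """Summarize a group's score distribution by a few empirical quantiles."""
    s = sorted(scores)
    # Nearest-rank quantile: the ceil(p * n)-th smallest score.
    return tuple(s[max(0, math.ceil(p * len(s)) - 1)] for p in probs)


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm with a deterministic initialization."""
    # Seed centroids with k points spread evenly over the sorted points.
    order = sorted(points)
    centroids = [order[(i * (len(order) - 1)) // max(1, k - 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid in Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each non-empty centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cluster_groups(scores_by_group, k):
    """Map each group id to a cluster label based on its score quantiles."""
    groups = sorted(scores_by_group)
    embeddings = [quantile_embedding(scores_by_group[g]) for g in groups]
    return dict(zip(groups, kmeans(embeddings, k)))
```

Groups whose models make similarly sized errors end up in the same cluster and can then share a calibration threshold, which helps when individual groups have too few calibration points to be calibrated on their own.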
Morteza Rakhshaninejad
Department of Data Analysis and Mathematical Modeling, Ghent University
Mira Jurgens
Department of Data Analysis and Mathematical Modeling, Ghent University
Nicolas Dewolf
Department of Data Analysis and Mathematical Modeling, Ghent University
Willem Waegeman
Ghent University
machine learning, data science, bioinformatics