🤖 AI Summary
This work addresses a limitation of deep learning models for drug–target interaction (DTI) prediction: they can reach strong benchmark performance without relying on mechanistically meaningful molecular features, and standard accuracy-based evaluation cannot detect this. The authors propose ISAAC (Intervention-based Structural Auditing Approach for Causal Reasoning), a post-hoc framework that audits the structural sensitivity of frozen DTI models independently of predictive accuracy. ISAAC probes each model with matched mechanistic and spurious input-level interventions and checks the results across training seeds, intervention seeds, and two distinct perturbation operators. Applied to three sequence-based DTI architectures on the Davis benchmark, it reveals approximately 25% relative differences in reasoning scores between models whose AUROC values lie within around 3% of each other. These differences are stable across seeds and operators, and they expose blind spots that conventional accuracy metrics leave undetected.
📝 Abstract
Deep learning models for drug–target interaction (DTI) prediction often achieve strong benchmark performance without necessarily relying on mechanistically meaningful molecular features, a limitation that standard accuracy-based evaluation cannot detect. We introduce ISAAC (Intervention-based Structural Auditing Approach for Causal Reasoning), a post-hoc framework that evaluates prior-relative structural sensitivity by probing frozen models through matched mechanistic and spurious input-level interventions, independently of predictive accuracy. Applied to three sequence-based DTI architectures on the Davis benchmark, ISAAC reveals approximately 25% relative differences in reasoning scores across models with comparable AUROC (within around 3%), stable across training and intervention seeds and two distinct perturbation operators. These discrepancies, undetectable under conventional accuracy metrics, motivate the use of post-hoc structural auditing as a complement to standard performance evaluation in scientific machine learning for molecular modeling.
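
The idea of matched mechanistic and spurious interventions on a frozen model can be illustrated with a minimal sketch. This is not the ISAAC scoring rule, which the abstract does not specify. The substitution operator, the `sensitivity_ratio` formula, the position lists, and both toy models are hypothetical and exist only to show the shape of such an audit: perturb the same number of residues inside and outside an assumed binding region, then compare how far the frozen model's prediction moves in each case.

```python
import random
from typing import Callable, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
Model = Callable[[str, str], float]  # (drug, protein sequence) -> predicted score


def substitute(seq: str, positions: Sequence[int], rng: random.Random) -> str:
    """Hypothetical perturbation operator: swap each listed residue for a different one."""
    chars = list(seq)
    for i in positions:
        chars[i] = rng.choice([a for a in AMINO_ACIDS if a != chars[i]])
    return "".join(chars)


def mean_shift(model: Model, drug: str, protein: str,
               positions: Sequence[int], n_seeds: int) -> float:
    """Mean absolute prediction change over several intervention seeds."""
    base = model(drug, protein)
    shifts = []
    for seed in range(n_seeds):
        perturbed = substitute(protein, positions, random.Random(seed))
        shifts.append(abs(model(drug, perturbed) - base))
    return sum(shifts) / len(shifts)


def sensitivity_ratio(model: Model, drug: str, protein: str,
                      mech_positions: Sequence[int],
                      spur_positions: Sequence[int],
                      n_seeds: int = 5) -> float:
    """Illustrative score in [0, 1]: share of total sensitivity that falls on
    the mechanistic positions. An assumption, not the paper's definition."""
    if len(mech_positions) != len(spur_positions):
        raise ValueError("interventions must be matched in size")
    mech = mean_shift(model, drug, protein, mech_positions, n_seeds)
    spur = mean_shift(model, drug, protein, spur_positions, n_seeds)
    total = mech + spur
    return 0.5 if total == 0 else mech / total


# Two toy frozen "models" that ignore the drug and read different regions.
def pocket_model(drug: str, protein: str) -> float:
    return protein[2:5].count("K") / 3  # reads only the assumed binding region


def shortcut_model(drug: str, protein: str) -> float:
    return protein[6:9].count("A") / 3  # reads only residues outside it


protein = "AAKKKAAAAA"
mech, spur = [2, 3, 4], [6, 7, 8]
pocket_score = sensitivity_ratio(pocket_model, "CCO", protein, mech, spur)
shortcut_score = sensitivity_ratio(shortcut_model, "CCO", protein, mech, spur)
```

On the unperturbed input both toy models return the same prediction, yet `pocket_score` and `shortcut_score` sit at opposite ends of the scale. That is the kind of gap an accuracy metric alone would not reveal.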