🤖 AI Summary
This work addresses the limited interpretability of black-box convolutional neural networks (CNNs), which hinders their adoption in high-stakes clinical settings for medical image analysis. To overcome this challenge, the authors propose the TTE-CAM framework, which transforms a pre-trained CNN into a self-explainable model at test time by replacing its classification head with a lightweight convolutional module, without requiring retraining. By reusing the original network weights, TTE-CAM generates intrinsic class activation maps that remain faithful to the original predictions while providing high-quality visual explanations. Experimental results demonstrate that TTE-CAM preserves the model's original predictive performance while delivering explanation quality that matches or exceeds that of state-of-the-art post-hoc interpretation methods, as confirmed through both qualitative and quantitative evaluations.
📝 Abstract
Convolutional neural networks (CNNs) achieve state-of-the-art performance in medical image analysis yet remain opaque, limiting adoption in high-stakes clinical settings. Existing approaches face a fundamental trade-off: post-hoc methods provide unfaithful approximate explanations, while inherently interpretable architectures are faithful but often sacrifice predictive performance. We introduce TTE-CAM, a test-time framework that bridges this gap by converting pretrained black-box CNNs into self-explainable models via a convolution-based replacement of their classification head, initialized from the original weights. The resulting model preserves black-box predictive performance while delivering built-in faithful explanations competitive with post-hoc methods, both qualitatively and quantitatively. The code is available at https://github.com/kdjoumessi/Test-Time-Explainability
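The core idea, replacing a CNN's global-average-pooling-plus-linear classification head with a 1x1 convolution initialized from the same weights, can be sketched as below. This is a minimal illustration of the classic CAM equivalence (conv-then-pool equals pool-then-linear), not the authors' exact TTE-CAM module; the toy backbone, class count, and function names are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "pretrained" black box: conv backbone + global average pooling + linear head.
# (Stand-in for a real pretrained CNN; architecture is illustrative only.)
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
)
fc = nn.Linear(32, 5)  # 5 hypothetical classes

def blackbox_logits(x):
    feats = backbone(x)
    pooled = feats.mean(dim=(2, 3))  # global average pooling
    return fc(pooled)

# Test-time conversion: a 1x1 conv whose weights are the FC weights reshaped,
# so no retraining is needed.
cam_conv = nn.Conv2d(32, 5, kernel_size=1)
with torch.no_grad():
    cam_conv.weight.copy_(fc.weight.view(5, 32, 1, 1))
    cam_conv.bias.copy_(fc.bias)

def explainable_forward(x):
    feats = backbone(x)
    maps = cam_conv(feats)          # intrinsic per-class activation maps (B, 5, H, W)
    logits = maps.mean(dim=(2, 3))  # pooling after the conv recovers the logits
    return logits, maps

x = torch.randn(2, 3, 32, 32)
logits, maps = explainable_forward(x)

# Predictions are preserved: averaging a linear map commutes with the linear head.
assert torch.allclose(logits, blackbox_logits(x), atol=1e-5)
```

Because the spatial mean of `w·f + b` equals `w·mean(f) + b`, the converted model's logits match the original black box exactly, while `maps` provides a built-in activation map per class.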