🤖 AI Summary
This study addresses the limited clinical generalizability of deep learning models for medical image classification, which stems from violations of the i.i.d. assumption and from opaque decision-making. Focusing on fracture detection, it investigates the relationship between adversarial robustness and clinical interpretability. The authors combine adversarial training with Grad-CAM-based visualization and quantitatively evaluate the spatial alignment between model-generated saliency maps and fracture regions annotated by an orthopedic surgeon. The empirical analysis indicates that improving adversarial robustness significantly sharpens the model's focus on anatomically critical regions (p < 0.01), which in turn increases explanation–annotation alignment. Better-aligned explanations can foster clinician trust and support human–AI collaborative decision-making. The study treats adversarial robustness and clinical interpretability as complementary safety criteria for the trustworthy deployment of AI in clinical settings.
📝 Abstract
Deep neural networks for medical image classification often fail to generalize consistently in clinical practice due to violations of the i.i.d. assumption and opaque decision-making. This paper examines interpretability in deep neural networks fine-tuned for fracture detection by evaluating model performance under adversarial attack and by comparing the explanations produced by interpretability methods with fracture regions annotated by an orthopedic surgeon. Our findings indicate that robust models yield explanations that are more closely aligned with clinically meaningful areas, suggesting that robustness encourages the prioritization of anatomically relevant features. We emphasize the value of interpretability for facilitating human-AI collaboration, in which models serve as assistants under a human-in-the-loop paradigm: clinically plausible explanations foster trust, enable error correction, and discourage reliance on AI for high-stakes decisions. This paper investigates robustness and interpretability as complementary criteria for bridging the gap between benchmark performance and safe, actionable clinical deployment.
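The comparison between explanations and annotated fracture regions can be made concrete with a small sketch. The abstract does not state which alignment metric is used, so the two measures below (intersection-over-union of the most salient pixels with the annotation mask, and a "pointing" check on the single most salient pixel) are assumptions chosen for illustration, as are the function names and the toy 4×4 data.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def top_fraction_mask(saliency: Grid, fraction: float = 0.25) -> List[List[bool]]:
    """Binary mask keeping roughly the highest-saliency `fraction` of pixels.

    Pixels tied with the threshold value are all kept.
    """
    flat = sorted((v for row in saliency for v in row), reverse=True)
    k = max(1, int(round(fraction * len(flat))))
    threshold = flat[k - 1]
    return [[v >= threshold for v in row] for row in saliency]


def iou(pred: Sequence[Sequence[bool]], target: Sequence[Sequence[int]]) -> float:
    """Intersection-over-union between two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


def pointing_hit(saliency: Grid, target: Sequence[Sequence[int]]) -> bool:
    """True if the single most salient pixel lies inside the annotated region."""
    _, r, c = max(
        (v, r, c) for r, row in enumerate(saliency) for c, v in enumerate(row)
    )
    return bool(target[r][c])


# Hypothetical toy data: a saliency map (e.g. from Grad-CAM) and a binary
# fracture mask standing in for the surgeon's annotation.
saliency = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.6, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
annotation = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

alignment = iou(top_fraction_mask(saliency, 0.25), annotation)
hit = pointing_hit(saliency, annotation)
print(f"IoU = {alignment:.3f}, pointing hit = {hit}")
```

Under this kind of measure, a robust model whose saliency concentrates on the annotated fracture region would score higher than a standard model whose saliency is diffuse or falls on irrelevant anatomy. The choice of the top-fraction threshold affects the IoU value, so any comparison between models should use the same threshold.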