🤖 AI Summary
This study addresses the limited clinical generalizability of deep learning models in medical image classification, which stems from violations of the i.i.d. assumption and opaque decision-making. Focusing on fracture detection, it systematically investigates the synergistic relationship between adversarial robustness and clinical interpretability. We propose a unified framework integrating adversarial training with Grad-CAM-based visualization, and quantitatively evaluate spatial alignment between model-generated saliency maps and orthopedic clinicians' annotations of fracture regions. Our empirical analysis, the first of its kind, demonstrates that enhancing adversarial robustness significantly improves model focus on anatomically critical regions (p < 0.01), thereby increasing explanation-annotation alignment. This improved alignment directly enhances clinician trust and supports human-AI collaborative decision-making. Crucially, we unify adversarial robustness and clinical interpretability as dual, complementary safety criteria for trustworthy AI deployment in clinical settings, establishing a novel paradigm for the responsible translation of AI into medical practice.
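The summary describes quantifying spatial alignment between model saliency maps and clinician annotations. The paper does not specify its alignment metric, but a minimal sketch of one common choice, Intersection-over-Union between a thresholded saliency map and a binary annotation mask, looks like this (the function name, threshold, and toy data are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def saliency_annotation_iou(saliency, annotation_mask, threshold=0.5):
    """Binarize a saliency map (e.g. Grad-CAM output) at `threshold`
    and compute Intersection-over-Union with a clinician's binary mask.
    This is one plausible alignment metric; the paper's exact metric
    is not specified here."""
    lo, hi = saliency.min(), saliency.max()
    saliency = (saliency - lo) / (hi - lo + 1e-8)   # normalize to [0, 1]
    pred = saliency >= threshold                     # salient region
    gt = annotation_mask.astype(bool)                # annotated fracture region
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 0.0

# Toy example: saliency concentrated on the same corner the
# (hypothetical) annotation covers, so overlap is perfect.
sal = np.zeros((4, 4)); sal[:2, :2] = 1.0
ann = np.zeros((4, 4)); ann[:2, :2] = 1
print(saliency_annotation_iou(sal, ann))  # -> 1.0
```

A higher IoU under this kind of metric corresponds to the tighter explanation-annotation alignment the study reports for robust models.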
📄 Abstract
Deep neural networks for medical image classification often fail to generalize consistently in clinical practice due to violations of the i.i.d. assumption and opaque decision-making. This paper examines interpretability in deep neural networks fine-tuned for fracture detection by evaluating model performance under adversarial attack and comparing explanations produced by interpretability methods against fracture regions annotated by an orthopedic surgeon. Our findings show that robust models yield explanations more closely aligned with clinically meaningful areas, indicating that robustness encourages anatomically relevant feature prioritization. We emphasize the value of interpretability for facilitating human-AI collaboration, in which models serve as assistants under a human-in-the-loop paradigm: clinically plausible explanations foster trust, enable error correction, and discourage overreliance on AI for high-stakes decisions. This paper investigates robustness and interpretability as complementary criteria for bridging the gap between benchmark performance and safe, actionable clinical deployment.
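The abstract evaluates models under adversarial attack but does not name the attack used. As a hedged illustration of the general idea, here is a minimal NumPy sketch of the one-step FGSM attack applied to a logistic-regression classifier; the weights, epsilon, and demo inputs are assumptions for exposition, not the paper's setup:

```python
import numpy as np

def fgsm_attack(w, b, x, y, epsilon=0.1):
    """One-step Fast Gradient Sign Method (FGSM) on a logistic-regression
    classifier: perturb the input by epsilon in the sign of the gradient
    of the binary cross-entropy loss with respect to the input."""
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))     # sigmoid probability
    grad_x = (p - y) * w             # d(BCE)/dx for logistic regression
    x_adv = x + epsilon * np.sign(grad_x)
    return np.clip(x_adv, 0.0, 1.0)  # keep pixel values in a valid range

# Toy demo on a flattened 8x8 "image" with random weights.
rng = np.random.default_rng(0)
w = rng.normal(size=64)
x = rng.uniform(size=64)
x_adv = fgsm_attack(w, b=0.0, x=x, y=1.0, epsilon=0.1)
assert np.abs(x_adv - x).max() <= 0.1 + 1e-12  # perturbation is bounded
```

Stronger iterative attacks (e.g. PGD) follow the same gradient-sign idea over multiple small steps; comparing explanations from models trained with and without such perturbations is what lets the study relate robustness to alignment.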