🤖 AI Summary
Current evaluations of adversarial robustness in medical imaging rely too heavily on attack success rate (ASR) and neglect perturbation magnitude, perceptual quality, and cross-architecture transferability, which leads to incomplete conclusions. This study systematically assesses seven prominent CNN and Vision Transformer models against seven attack methods on four medical imaging datasets at five perturbation budgets, measuring ASR, PSNR, SSIM, and the L2 perturbation norm. It finds that ASR correlates only weakly with the perceptual and distortion metrics, so relying on ASR alone can misrepresent adversarial risk. The work advocates a multi-metric evaluation framework for assessing the safety of medical AI systems more holistically and reliably.
📝 Abstract
While deep learning systems are becoming increasingly prevalent in medical image analysis, their vulnerability to adversarial perturbations raises serious concerns for clinical deployment. Evaluations of this vulnerability largely rely on Attack Success Rate (ASR), a binary metric that indicates only whether an attack succeeds. ASR does not account for other factors, such as perturbation strength, perceptual image quality, and cross-architecture attack transferability, so the picture it gives is incomplete. This gap matters because complex, large-scale deep learning systems, including Vision Transformers (ViTs), are increasingly challenging the dominance of Convolutional Neural Networks (CNNs). These architectures learn differently, and it is unclear whether a single metric such as ASR can effectively capture adversarial behavior. To address this, we perform a systematic empirical study on four medical image datasets: PathMNIST, DermaMNIST, RetinaMNIST, and CheXpert. We evaluate seven models (VGG-16, ResNet-50, DenseNet-121, Inception-v3, DeiT, Swin Transformer, and ViT-B/16) against seven attack methods at five perturbation budgets, measuring ASR, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and $L_2$ perturbation magnitude. Our findings show a consistent pattern: perceptual and distortion metrics are strongly associated with one another but exhibit minimal correlation with ASR. This holds for both CNNs and ViTs. The results demonstrate that ASR alone is an inadequate indicator of adversarial robustness and transferability. Consequently, we argue that a thorough assessment of adversarial risk in medical AI requires multi-metric frameworks that capture not only whether an attack succeeds but also how it succeeds and at what cost.
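To make the metrics concrete, the following is a minimal sketch of how ASR, PSNR, and the $L_2$ perturbation magnitude could be computed and then compared with a Pearson correlation. It is not the authors' implementation. The function names, the flattened-list image representation, the pixel range of [0, 1], and the choice to compute ASR only over samples the model originally classified correctly are all assumptions made for illustration. SSIM is omitted because it needs a windowed computation that does not fit a short example.

```python
import math

def l2_norm(clean, adv):
    """L2 norm of the perturbation (adv - clean) over flattened pixel values."""
    return math.sqrt(sum((a - c) ** 2 for c, a in zip(clean, adv)))

def psnr(clean, adv, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB (assumes pixels lie in [0, max_val])."""
    mse = sum((a - c) ** 2 for c, a in zip(clean, adv)) / len(clean)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(max_val ** 2 / mse)

def attack_success_rate(labels, clean_preds, adv_preds):
    """Fraction of originally correct predictions that the attack flips.

    Assumption: samples misclassified before the attack are excluded.
    """
    flipped = [a != y for y, c, a in zip(labels, clean_preds, adv_preds) if c == y]
    return sum(flipped) / len(flipped) if flipped else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy example: one 4-pixel "image" and four classified samples (made-up values).
clean_img = [0.0, 0.5, 1.0, 0.25]
adv_img = [0.1, 0.5, 0.9, 0.25]
print("L2:", l2_norm(clean_img, adv_img))
print("PSNR (dB):", psnr(clean_img, adv_img))
print("ASR:", attack_success_rate([0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 1, 0]))
```

In a multi-metric evaluation of the kind the abstract describes, one would collect these values per model, attack, and budget, then call `pearson` on pairs of metric columns (for example ASR against PSNR) to check how strongly they are associated.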