🤖 AI Summary
Although AI-based skin lesion classifiers can match or even surpass clinicians' diagnostic accuracy, their clinical adoption remains limited by low trust, stemming primarily from inconsistencies and incomplete class coverage in existing explainability methods (e.g., LIME, CAM). To address this, we propose Global Class Activation Probability maps (GCAP), a pixel-level, multi-class, global explanation technique that enables fine-grained, cross-class interpretability of diagnostic decisions. GCAP is integrated with SafeML, a proactive uncertainty-aware mechanism that flags high-risk predictions. Our framework combines MobileNetV2 and Vision Transformer (ViT) backbones, jointly optimizing class activation mapping and epistemic uncertainty quantification on the ISIC dataset. Experiments show that GCAP significantly improves explanation consistency and class discriminability, while SafeML reliably identifies high-uncertainty predictions. Together, they reduce misdiagnosis risk and enhance the clinical credibility and safety of AI-assisted dermatological diagnosis.
📝 Abstract
Recent advancements in skin lesion classification models have significantly improved accuracy, with some models even surpassing dermatologists' diagnostic performance. However, distrust in AI models remains a challenge in medical practice. Beyond high accuracy, trustworthy, explainable diagnoses are essential. Existing explainability methods have reliability issues: LIME-based methods suffer from inconsistency, while CAM-based methods fail to consider all classes. To address these limitations, we propose Global Class Activation Probabilistic Map Evaluation, a method that analyzes the activation probability maps of all classes probabilistically and at the pixel level. By visualizing the diagnostic process in a unified manner, it helps reduce the risk of misdiagnosis. Furthermore, applying SafeML enhances the detection of false diagnoses and issues warnings to doctors and patients as needed, improving diagnostic reliability and, ultimately, patient safety. We evaluated our method on the ISIC datasets with MobileNetV2 and Vision Transformers.
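The abstract does not give GCAP's exact formulation, but the core idea of a pixel-level, all-class probability map can be sketched as follows: given one class activation map per class, normalize across the class axis at every pixel so that each pixel carries a full probability distribution over all diagnostic classes. This is a hypothetical sketch assuming a per-pixel softmax; the function name `gcap_from_cams` and the toy shapes are illustrative, not the authors' implementation.

```python
import numpy as np

def gcap_from_cams(cams: np.ndarray) -> np.ndarray:
    """Sketch of a global class activation probability map.

    cams: array of shape (C, H, W) holding one class activation
    map per class. Returns an array of the same shape where, at
    every pixel, the C values form a probability distribution
    (softmax across the class axis), enabling cross-class
    comparison at the pixel level.
    """
    # Numerically stabilized softmax over the class dimension (axis 0)
    shifted = cams - cams.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

# Toy example: 3 lesion classes on a 4x4 spatial grid
rng = np.random.default_rng(0)
cams = rng.normal(size=(3, 4, 4))
gcap = gcap_from_cams(cams)
```

Because every pixel sums to one across classes, regions where no class dominates show up as near-uniform distributions, which is one way such a map could expose ambiguous areas for uncertainty-aware alerting.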