🤖 AI Summary
Early diagnosis of Mpox (monkeypox) skin lesions remains challenging because they clinically resemble other dermatoses. To address this, we propose a multi-model comparative framework that integrates transfer learning and eXplainable AI (XAI). We systematically evaluate four pre-trained CNNs (VGG16, VGG19, InceptionV3, and MobileNetV2) using layer freezing, custom classification heads, and Grad-CAM for visual interpretability. Performance is assessed on both binary classification (Mpox vs. non-Mpox) and fine-grained multiclass classification that includes other dermatoses such as varicella and herpes zoster. InceptionV3 achieves 95% accuracy in binary classification, while MobileNetV2 attains 93% accuracy in multiclass classification. Grad-CAM localizes lesion-relevant regions, which improves decision transparency and clinical trustworthiness. This work offers a reproducible and interpretable approach to AI-assisted diagnosis of cutaneous infectious diseases.
📝 Abstract
Context: Mpox is a zoonotic disease caused by the Mpox virus. Its skin lesions resemble those of other skin conditions, making accurate early diagnosis challenging. Artificial intelligence (AI), especially Deep Learning (DL), has become a strong tool for medical image analysis; however, the use of pre-trained models such as CNNs together with XAI techniques for mpox detection remains underexplored. Objective: This study aims to evaluate the effectiveness of pre-trained CNN models (VGG16, VGG19, InceptionV3, MobileNetV2) for the early detection of monkeypox using binary and multi-class datasets. It also seeks to enhance model interpretability using Grad-CAM, an XAI technique. Method: Two datasets, MSLD and MSLD v2.0, were used for training and validation. Transfer learning was applied to fine-tune the pre-trained CNN models: the initial layers were frozen and custom layers were added to adapt the final features to the mpox detection task and to reduce overfitting. Model performance was evaluated using accuracy, precision, recall, F1-score, and ROC. Grad-CAM was used to visualize the image features that most influenced each prediction. Results: InceptionV3 demonstrated the best performance on the binary dataset with an accuracy of 95%, while MobileNetV2 performed best on the multi-class dataset with an accuracy of 93%. Grad-CAM successfully highlighted key image regions. Despite high accuracy, some models showed overfitting tendencies, as evidenced by discrepancies between training and validation losses. Conclusion: This study underscores the potential of pre-trained CNN models in monkeypox detection and the value of XAI techniques. Future work should address dataset limitations, incorporate multimodal data, and explore additional interpretability techniques to improve diagnostic reliability and model transparency.
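To make the reported evaluation metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for the binary (Mpox vs. non-Mpox) setting. The function name `binary_metrics` and the label lists are hypothetical and chosen only for illustration; they are not taken from the study, and the authors' actual evaluation code may differ.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = Mpox, 0 = non-Mpox (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For the multi-class dataset, the same per-class computation could be repeated with each class in turn treated as `positive` and the results averaged; whether the study used macro or weighted averaging is not stated in the abstract.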