🤖 AI Summary
To address the challenges of complex anatomical structures in fundus images and low localization accuracy under weak supervision, this paper proposes Hierarchical Salient Patch Identification (HSPI), a method enabling interpretable lesion localization using only image-level labels and a standard classifier. HSPI introduces three key innovations: (1) a hierarchical framework that integrates multi-scale feature response analysis for robust saliency detection; (2) a conditional peak-focusing mechanism to enhance precise localization of discriminative regions; and (3) an intersection-driven pseudo-label filtering strategy with multiple patch sizes to mitigate both imprecise and incomplete localization. Evaluated on multiple retinal datasets, HSPI achieves state-of-the-art performance in IoU, Dice, and Top-1 Localization accuracy. Ablation studies confirm the individual efficacy and synergistic contributions of all components.
📝 Abstract
As deep learning becomes widely used in medical image analysis, explaining model predictions and improving diagnostic accuracy have become pressing problems. Attribution methods are key tools for helping doctors understand the basis of a model's diagnosis, and they are used to explain and localize diseases in medical images. However, previous methods localize fundus diseases inaccurately and incompletely because these diseases have complex and diverse structures. To address this, we propose hierarchical salient patch identification (HSPI), a weakly supervised, interpretable fundus disease localization method that requires only image-level labels and a neural network classifier (NNC). First, we propose salient patch identification (SPI), which divides the image into several patches and optimizes a consistency loss to identify the patch in the input image that is most important for the network's prediction, thereby locating the disease. Second, we propose a hierarchical identification strategy that forces SPI to analyze the importance of different areas to the NNC's prediction, so that disease areas are located comprehensively. Conditional peak focusing is then introduced to ensure that the mask vector accurately locates the disease area. Finally, we propose patch selection based on multi-sized intersections to filter out non-disease regions that were identified incorrectly or superfluously. We conduct disease localization experiments on fundus image datasets and achieve the best performance on multiple evaluation metrics compared with previous interpretable attribution methods. Additional ablation studies verify the effectiveness of each component.
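The abstract does not spell out how patch selection based on multi-sized intersections is computed. As a rough illustration only, the sketch below assumes that the salient patches found at each patch size are rasterized into a pixel mask and that only pixels flagged at every size are kept. The function names (`patch_mask`, `intersect_multi_size`), the grid indexing, and the toy inputs are all hypothetical and are not taken from the paper.

```python
import numpy as np

def patch_mask(shape, patch_size, salient_patches):
    """Rasterize salient (row, col) patch indices into a boolean pixel mask."""
    mask = np.zeros(shape, dtype=bool)
    for r, c in salient_patches:
        mask[r * patch_size:(r + 1) * patch_size,
             c * patch_size:(c + 1) * patch_size] = True
    return mask

def intersect_multi_size(shape, salient_by_size):
    """Keep only the pixels marked salient at every patch size (assumed rule)."""
    masks = [patch_mask(shape, size, patches)
             for size, patches in salient_by_size.items()]
    return np.logical_and.reduce(masks)

# Toy 8x8 image; the salient patch indices below are made up for illustration.
shape = (8, 8)
salient_by_size = {
    4: [(0, 0), (1, 1)],          # 4x4 patches: top-left and bottom-right
    2: [(0, 0), (0, 1), (3, 0)],  # 2x2 patches: two at the top, one bottom-left
}
result = intersect_multi_size(shape, salient_by_size)
```

In this toy case the bottom-left 2x2 patch is dropped because no 4x4 patch covers it, which mirrors the stated goal of filtering out regions that only one patch size flags. The actual selection rule in HSPI may differ.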