🤖 AI Summary
In multi-view diabetic retinopathy (DR) detection, key challenges include large inter-lesion scale variation, spatially dispersed lesion distributions, and inadequate modeling of inter-view correlations and redundancies during cross-view feature fusion. To address these, the authors propose a novel dual-branch multi-view detection framework. The method introduces a wavelet high-frequency branch that enhances lesion edge details and is guided by global semantic information for precise lesion localization; in addition, a cross-view cross-attention module explicitly models both complementarity and redundancy across views. The framework integrates wavelet decomposition, dual-branch CNNs, cross-view attention, and global–local interaction mechanisms. Evaluated on mainstream public DR datasets, the approach achieves state-of-the-art performance, with notable gains in detection sensitivity for micro-lesions and occluded lesions. The source code is publicly available.
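The paper's exact branch design is not reproduced here, but the core signal-processing step, extracting the high-frequency sub-bands of a wavelet transform to expose lesion edges, can be illustrated with a minimal single-level 2-D Haar decomposition. This is a generic sketch in numpy, not the authors' implementation; the function name `haar_dwt2` is ours.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar wavelet transform of a 2-D array
    with even height and width.

    Returns (LL, LH, HL, HH). LL is the low-frequency
    approximation; the three high-frequency sub-bands carry the
    edge detail a wavelet branch would feed downstream.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0  # block average (low frequency)
    lh = (a + b - c - d) / 4.0  # difference across rows: horizontal edges
    hl = (a - b + c - d) / 4.0  # difference across columns: vertical edges
    hh = (a - b - c + d) / 4.0  # diagonal detail
    return ll, lh, hl, hh

# A step edge between rows 0 and 1 shows up only in the
# horizontal-edge sub-band; the other high-frequency bands stay zero.
img = np.zeros((4, 4))
img[1:, :] = 1.0
ll, lh, hl, hh = haar_dwt2(img)
```

In a real pipeline the retina image would first be converted to grayscale (or transformed per channel), and a library such as PyWavelets (`pywt.dwt2`) would typically replace this hand-rolled version.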
📝 Abstract
Multi-view diabetic retinopathy (DR) detection has recently emerged as a promising way to address the incomplete lesion coverage of single-view DR detection. However, it remains challenging due to the variable sizes and scattered locations of lesions. Furthermore, existing multi-view DR methods typically merge multiple views without considering the correlations and redundancies of lesion information across them. We therefore propose a novel method to overcome the challenges of difficult lesion learning and inadequate multi-view fusion. Specifically, we introduce a two-branch network to capture both local lesion features and their global dependencies. The high-frequency component of the wavelet transform is used to exploit lesion edge information, which is then enhanced with global semantics to facilitate learning of difficult lesions. Additionally, we present a cross-view fusion module to improve multi-view fusion and reduce redundancy. Experimental results on large public datasets demonstrate the effectiveness of our method. The code is open-sourced at https://github.com/HuYongting/WGLIN.
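The abstract does not spell out the cross-view fusion module, but its stated mechanism, cross-attention between views, follows a standard pattern: tokens from one view query the tokens of another, so complementary lesion evidence is pulled in while redundant content only reweights attention. A minimal numpy sketch of that pattern follows; all names (`cross_view_attention`, the projection matrices) are illustrative, not the paper's API.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(q_view, kv_view, wq, wk, wv):
    """Scaled dot-product cross-attention between two views.

    q_view:  (Nq, d) tokens of the view being enriched
    kv_view: (Nk, d) tokens of the other view
    Returns (Nq, d) fused features: each q_view token is a
    weighted mix of kv_view values, so lesion cues visible only
    in the other view can flow into this view's representation.
    """
    q = q_view @ wq
    k = kv_view @ wk
    v = kv_view @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (Nq, Nk) similarity
    attn = softmax(scores, axis=-1)          # rows sum to 1
    return attn @ v

# Toy example: 16 tokens per view, 8-dim features.
rng = np.random.default_rng(0)
d = 8
view_a = rng.standard_normal((16, d))
view_b = rng.standard_normal((16, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
fused = cross_view_attention(view_a, view_b, wq, wk, wv)  # shape (16, 8)
```

In practice this would be a multi-head module in a deep-learning framework (e.g. PyTorch's `nn.MultiheadAttention`) applied symmetrically in both directions, with the redundancy-reduction behavior learned through training rather than hard-coded.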