Wavelet-based Global-Local Interaction Network with Cross-Attention for Multi-View Diabetic Retinopathy Detection

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multi-view diabetic retinopathy (DR) detection, challenges include large inter-lesion scale variations, spatially dispersed lesion distributions, and inadequate modeling of inter-view correlations and redundancies during cross-view feature fusion. To address these, we propose a novel dual-branch multi-view detection framework. Our method introduces a wavelet high-frequency branch to enhance lesion edge details, guided by global semantic information for precise local localization; additionally, we design a cross-view cross-attention module that explicitly models both complementarity and redundancy across views. The framework integrates wavelet decomposition, dual-branch CNNs, cross-view attention, and global–local interaction mechanisms. Evaluated on multiple mainstream public DR datasets, our approach achieves state-of-the-art performance, significantly improving detection sensitivity for micro-lesions and occluded lesions. The source code is publicly available.

📝 Abstract
Multi-view diabetic retinopathy (DR) detection has recently emerged as a promising way to address the incomplete lesion coverage of single-view DR detection. However, it remains challenging because lesions vary in size and are scattered in location. Furthermore, existing multi-view DR methods typically merge multiple views without considering the correlations and redundancies of lesion information across them. Therefore, we propose a novel method to overcome the challenges of difficult lesion information learning and inadequate multi-view fusion. Specifically, we introduce a two-branch network to obtain both local lesion features and their global dependencies. The high-frequency component of the wavelet transform is used to exploit lesion edge information, which is then enhanced by global semantics to facilitate the learning of difficult lesions. Additionally, we present a cross-view fusion module to improve multi-view fusion and reduce redundancy. Experimental results on large public datasets demonstrate the effectiveness of our method. The code is open-sourced at https://github.com/HuYongting/WGLIN.
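To make the "high-frequency component of the wavelet transform" concrete, here is a minimal sketch of a single-level 2D Haar decomposition in plain Python. It is an illustration only, not the authors' implementation: the abstract does not say which wavelet family or decomposition level the paper uses, and the function names `haar_dwt2` and `high_frequency_energy` are hypothetical.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of a 2D list with even height and width.

    Returns (approx, row_diff, col_diff, diag), each half the input size.
    The Haar wavelet is an assumption made for this sketch.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # 2x2 block: a b on the top row, d e on the bottom row
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2)  # low-pass: local average (scaled)
            r_row.append((a + b - d - e) / 2)  # top minus bottom: horizontal edges
            c_row.append((a - b + d - e) / 2)  # left minus right: vertical edges
            d_row.append((a - b - d + e) / 2)  # diagonal detail
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def high_frequency_energy(image):
    """Sum of squared high-frequency coefficients per 2x2 block (a crude edge map)."""
    _, row_diff, col_diff, diag = haar_dwt2(image)
    return [
        [rd * rd + cd * cd + dg * dg for rd, cd, dg in zip(r_row, c_row, d_row)]
        for r_row, c_row, d_row in zip(row_diff, col_diff, diag)
    ]
```

On a flat image every high-frequency coefficient is zero, while a block that straddles an intensity step yields a non-zero response. That is the property that makes these sub-bands useful for highlighting lesion boundaries.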
Problem

Research questions and friction points this paper is trying to address.

Detecting diabetic retinopathy across multiple views with incomplete lesions
Addressing variable lesion sizes and scattered locations in multi-view DR
Improving multi-view fusion by reducing redundancies and enhancing correlations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wavelet-based two-branch network for lesion features
Cross-view fusion module reduces information redundancy
Global-local interaction enhances difficult lesion learning
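The cross-view fusion idea can be sketched with generic scaled dot-product cross-attention, in which feature tokens from one view query the tokens of another view. This is a simplified stand-in under stated assumptions, not the paper's module: it omits learned query/key/value projections and any explicit redundancy handling, and the names `softmax` and `cross_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_tokens, context_tokens):
    """Fuse one view's tokens with another view's tokens.

    query_tokens:   list of feature vectors from the view being updated
    context_tokens: list of feature vectors from the other view
    Assumption for this sketch: Q = query_tokens and K = V = context_tokens,
    with no learned projection matrices.
    """
    dim = len(query_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in query_tokens:
        # Similarity of this query token to every token in the other view
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context_tokens]
        weights = softmax(scores)
        # Weighted average of the other view's tokens
        attended = [
            sum(w * v[j] for w, v in zip(weights, context_tokens)) for j in range(dim)
        ]
        # Residual connection keeps the original view's information
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused
```

Each fused token is the original token plus an attention-weighted mix of the other view's tokens, so regions the other view sees well contribute more to the result. In practice a deep-learning framework's attention layer with learned projections would replace this loop.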
Yongting Hu
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen, China; Shenzhen Key Laboratory of Visual Object Detection and Recognition, Shenzhen, China
Yuxin Lin
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen, China; Shenzhen Key Laboratory of Visual Object Detection and Recognition, Shenzhen, China
Chengliang Liu
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen, China; Shenzhen Key Laboratory of Visual Object Detection and Recognition, Shenzhen, China
Xiaoling Luo
Shenzhen University; Harbin Institute of Technology, Shenzhen
Medical image processing, Computer vision
Xiaoyan Dou
Ophthalmology Department, Shenzhen Second People’s Hospital, Shenzhen, China
Qihao Xu
Harbin Institute of Technology (Shenzhen)
CV
Yong Xu
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen, China; Shenzhen Key Laboratory of Visual Object Detection and Recognition, Shenzhen, China