Learning to Fuse and Reconstruct Multi-View Graphs for Diabetic Retinopathy Grading

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of existing multi-view fundus image fusion methods, which often neglect inter-view correlations and fail to exploit the consistency across views. To overcome this, the authors propose MVGFDR, an end-to-end multi-view graph fusion framework that explicitly decouples shared and view-specific features by constructing a multi-view graph. The framework enhances viewpoint-invariant representation learning through three key components: frequency-domain anchors derived from DCT coefficients, selective node fusion, and a cross-view masked reconstruction mechanism. Extensive experiments on MFIDDR, the largest multi-view fundus dataset to date, demonstrate that MVGFDR significantly outperforms current state-of-the-art approaches, achieving notable improvements in the accuracy of diabetic retinopathy grading.
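The cross-view masked reconstruction idea mentioned above can be illustrated with a minimal sketch: mask a subset of one view's node features and score how well the other view's features account for them. Note that `masked_crossview_loss` and its mean-squared-error formulation are illustrative assumptions, not the paper's actual objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_crossview_loss(view_a, view_b, mask_ratio=0.5):
    """Hypothetical sketch of cross-view masked reconstruction: mask a
    fraction of view A's node features and measure (via MSE) how well the
    corresponding features from view B stand in for them. A low loss
    indicates consistent, view-invariant content shared by the two views."""
    n = view_a.shape[0]
    # Randomly select the node indices to mask out in view A.
    masked = rng.choice(n, size=int(n * mask_ratio), replace=False)
    recon = view_b[masked]  # stand-in "reconstruction" taken from the other view
    return float(np.mean((view_a[masked] - recon) ** 2))
```

In the paper's framework this signal would be minimized during training to encourage view-invariant representations; here it only demonstrates the masking-and-comparison pattern.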

📝 Abstract
Diabetic retinopathy (DR) is one of the leading causes of vision loss worldwide, making early and accurate DR grading critical for timely intervention. Recent clinical practices leverage multi-view fundus images for DR detection with a wide coverage of the field of view (FOV), motivating deep learning methods to explore the potential of multi-view learning for DR grading. However, existing methods often overlook the inter-view correlations when fusing multi-view fundus images, failing to fully exploit the inherent consistency across views originating from the same patient. In this work, we present MVGFDR, an end-to-end Multi-View Graph Fusion framework for DR grading. Different from existing methods that directly fuse visual features from multiple views, MVGFDR is equipped with a novel Multi-View Graph Fusion (MVGF) module to explicitly disentangle the shared and view-specific visual features. Specifically, MVGF comprises three key components: (1) Multi-view Graph Initialization, which constructs visual graphs via residual-guided connections and employs Discrete Cosine Transform (DCT) coefficients as frequency-domain anchors; (2) Multi-view Graph Fusion, which integrates selective nodes across multi-view graphs based on frequency-domain relevance to capture complementary view-specific information; and (3) Masked Cross-view Reconstruction, which leverages masked reconstruction of shared information across views to facilitate view-invariant representation learning. Extensive experimental results on MFIDDR, by far the largest multi-view fundus image dataset, demonstrate the superiority of our proposed approach over existing state-of-the-art approaches in diabetic retinopathy grading.
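The frequency-domain anchors in component (1) are built from Discrete Cosine Transform coefficients. As a rough, self-contained sketch of how such an anchor could be extracted from a square patch (the function names and the choice of keeping only the top-left low-frequency block are assumptions, not the paper's exact design):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    j = np.arange(n)[None, :]  # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)    # rescale the DC row for orthonormality
    return C

def frequency_anchor(patch, k=4):
    """Flatten the k x k low-frequency 2-D DCT coefficients of a square
    patch into an anchor vector (a sketch of the frequency-domain anchor)."""
    n = patch.shape[0]
    C = dct_matrix(n)
    coeffs = C @ patch @ C.T   # separable 2-D DCT-II
    return coeffs[:k, :k].ravel()
```

Low-frequency DCT coefficients summarize coarse illumination and structure, which is plausibly why they can serve as stable anchors for relating graphs built from different views of the same eye.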
Problem

Research questions and friction points this paper is trying to address.

Diabetic Retinopathy Grading
Multi-View Fusion
Inter-View Correlation
Fundus Images
View Consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-View Graph Fusion
Diabetic Retinopathy Grading
View-Invariant Representation
Masked Cross-view Reconstruction
Frequency-Domain Anchors