🤖 AI Summary
To address data imbalance, high heterogeneity, and insufficient model interpretability in diabetic retinopathy (DR) classification, this paper proposes VR-FuseNet, a dual-stream deep network integrating VGG19 and ResNet50V2 features, alongside a five-source heterogeneous fundus image fusion framework. Preprocessing combines SMOTE oversampling with CLAHE enhancement to improve robustness on minority classes. Explainable AI (XAI) techniques, including Grad-CAM and Score-CAM, are incorporated to visualize the pathological regions (microaneurysms, hemorrhages, and exudates) that drive predictions. Evaluated on the mixed heterogeneous dataset, VR-FuseNet achieves 91.82% accuracy, outperforming single-model baselines on sensitivity (92.3%), specificity (91.5%), and F1-score (91.7%). The framework substantially improves clinical interpretability and cross-dataset generalization.
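The dual-stream fusion described above can be sketched numerically. The following is a minimal, hypothetical illustration (not the paper's actual implementation): pooled feature vectors from the two backbones (VGG19's final convolutional block yields 512 channels, ResNet50V2's yields 2048 after global average pooling) are concatenated into one descriptor and passed through a linear classification head over five DR grades; all weights here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pooled feature vectors from the two backbones for one image:
# VGG19's final conv block pools to 512 channels, ResNet50V2's to 2048.
vgg_feat = rng.standard_normal(512)
resnet_feat = rng.standard_normal(2048)

# Fusion by concatenation into a single 2560-d descriptor.
fused = np.concatenate([vgg_feat, resnet_feat])

# A linear classification head over the fused features (5 DR grades),
# with placeholder random weights.
W = rng.standard_normal((5, fused.size)) * 0.01
b = np.zeros(5)
logits = W @ fused + b

# Softmax over the 5 grades (numerically stabilised).
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(fused.shape)  # (2560,)
```

Concatenation keeps both feature streams intact rather than averaging them, which is what lets the classifier weight fine-grained VGG19 cues and deep ResNet50V2 hierarchies independently.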
📝 Abstract
Diabetic retinopathy (DR) is a severe eye condition caused by diabetes in which the retinal blood vessels are damaged, and it can lead to vision loss and blindness if left untreated. Early and accurate detection is therefore key to timely intervention and to halting disease progression. This paper presents a comprehensive approach to automated diabetic retinopathy detection built around a new hybrid deep learning model, VR-FuseNet. Because DR is a leading cause of blindness, especially among diabetic patients, accurate and efficient automated detection methods are required. To address the limitations of existing methods, including dataset imbalance, limited diversity, and poor generalization, this paper constructs a hybrid dataset from five publicly available diabetic retinopathy datasets. Essential preprocessing techniques, SMOTE for class balancing and CLAHE for image enhancement, are applied systematically to improve the robustness and generalizability of the data. The proposed VR-FuseNet model combines the strengths of two state-of-the-art convolutional neural networks: VGG19, which captures fine-grained spatial features, and ResNet50V2, which is known for deep hierarchical feature extraction. This fusion improves diagnostic performance, achieving an accuracy of 91.824%. The model outperforms the individual architectures on all performance metrics, demonstrating the effectiveness of hybrid feature extraction in diabetic retinopathy classification. To make the proposed model more clinically useful and interpretable, this paper incorporates multiple XAI techniques. These generate visual explanations that clearly indicate the retinal features driving the model's predictions, such as microaneurysms, hemorrhages, and exudates, so that clinicians can interpret and validate its decisions.
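The SMOTE balancing step mentioned in the abstract works by interpolating between a minority-class sample and one of its nearest minority-class neighbours. A minimal sketch of that core idea follows; the function name, toy 2-D features, and parameter defaults are illustrative assumptions, not the paper's pipeline (which would operate on image-level data via a library such as imbalanced-learn).

```python
import numpy as np

def smote_sample(X_min, k=3, n_new=4, seed=0):
    """Generate synthetic minority samples by interpolating each picked
    sample toward one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        x = X_min[i]
        d = np.linalg.norm(X_min - x, axis=1)  # distances to all samples
        d[i] = np.inf                          # exclude the sample itself
        neighbours = np.argsort(d)[:k]         # k nearest minority samples
        j = rng.choice(neighbours)
        lam = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append(x + lam * (X_min[j] - x))
    return np.array(synthetic)

# Toy minority-class feature vectors (hypothetical 2-D features).
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_sample(X_min)
print(X_new.shape)  # (4, 2)
```

Because each synthetic point lies on a segment between two real minority samples, SMOTE expands the minority region without duplicating existing examples, which is what improves robustness on under-represented DR grades.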
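The Grad-CAM explanations referenced in the abstract weight each channel of the last convolutional feature map by its spatially averaged gradient, sum the weighted channels, and apply a ReLU so only positively contributing regions remain. The sketch below uses random toy tensors in place of real backbone activations and gradients; shapes and names are assumptions for illustration only.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM: weight each activation channel by its spatially averaged
    gradient, sum the channels, then keep only positive evidence (ReLU)."""
    # activations, gradients: (channels, H, W) from the last conv layer.
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum
    cam = np.maximum(cam, 0)                          # ReLU
    if cam.max() > 0:
        cam /= cam.max()                              # normalise to [0, 1]
    return cam

# Toy activations/gradients (hypothetical 4-channel, 7x7 feature map).
rng = np.random.default_rng(1)
acts = rng.random((4, 7, 7))
grads = rng.standard_normal((4, 7, 7))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)  # (7, 7)
```

Upsampled to the input resolution and overlaid on the fundus image, such a heatmap highlights where lesion evidence (microaneurysms, hemorrhages, exudates) influenced the predicted DR grade, which is what lets clinicians validate the model's reasoning.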