🤖 AI Summary
Deep learning models for fundus image analysis often lack reliability, and their predictive uncertainty is difficult to quantify. Method: This paper proposes a framework that integrates Generative Flow Networks (GFlowNets) with structured uncertainty modeling. It uses GFlowNets to learn a posterior distribution over discrete dropout masks, which it presents as the first such application. The resulting mechanism, GFlowOut, is combined with ResNet-18 and Vision Transformer (ViT) backbones for multi-granularity uncertainty estimation and supports several dropout mask strategies (none, random, bottom-up, top-down). Results: On diabetic retinopathy and glaucoma early detection tasks, the method significantly improves classification accuracy over conventional dropout baselines. Grad-CAM visualizations indicate that the model consistently attends to clinically relevant anatomical regions, which supports its interpretability.
📝 Abstract
Ocular diseases, including diabetic retinopathy and glaucoma, present a significant public health challenge due to their high prevalence and potential for causing vision impairment. Early and accurate diagnosis is crucial for effective treatment and management. In recent years, deep learning models have emerged as powerful tools for analysing medical images, such as retinal images. However, challenges persist in model reliability and uncertainty estimation, both of which are critical for clinical decision-making. This study leverages the probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over latent discrete dropout masks for the classification and analysis of ocular diseases using fundus images. We develop a robust and generalizable method that integrates GFlowOut with ResNet-18 and ViT backbones to identify various ocular conditions. The study employs four dropout mask types (none, random, bottom-up, and top-down) to enhance model performance on these fundus images. Our results demonstrate that the learnable probabilistic latents significantly improve accuracy, outperforming the traditional dropout approach. We use Grad-CAM, a gradient-based class activation mapping method, to assess model explainability, and observe that the model focuses on the image regions that are critical for its predictions. The integration of GFlowOut in neural networks presents a promising advancement in the automated diagnosis of ocular diseases, with implications for improving clinical workflows and patient outcomes.
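To make the idea of predicting under sampled dropout masks concrete, the following is a minimal sketch and not the paper's implementation. It assumes a fixed per-unit Bernoulli keep probability as a stand-in for the mask distribution that a GFlowNet would learn, and a single linear classification layer in place of the ResNet-18 or ViT backbone. The names `mc_mask_predict`, `keep_prob`, `features`, and `weights` are hypothetical. The sketch averages class probabilities over many sampled masks and reports the entropy of the averaged prediction as a simple uncertainty score.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mc_mask_predict(features, weights, keep_prob, n_samples=50, seed=0):
    """Average class probabilities over sampled binary dropout masks.

    Assumption: masks are drawn from independent Bernoulli(keep_prob)
    variables. In a GFlowOut-style model the mask distribution would
    instead be produced by a learned GFlowNet policy.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_samples):
        # Sample one binary mask per input (True = keep the unit).
        mask = rng.random(features.shape) < keep_prob
        # Inverted-dropout rescaling keeps the expected activation unchanged.
        masked = features * mask / keep_prob
        probs.append(softmax(masked @ weights))
    mean_probs = np.mean(probs, axis=0)
    # Predictive entropy of the averaged distribution, one value per input.
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)
    return mean_probs, entropy


# Toy data: 4 inputs with 16 features each, 2 classes (hypothetical values).
data_rng = np.random.default_rng(1)
features = data_rng.normal(size=(4, 16))
weights = data_rng.normal(size=(16, 2))
keep_prob = np.full(16, 0.8)

mean_probs, entropy = mc_mask_predict(features, weights, keep_prob)
print(mean_probs.shape, entropy.shape)
```

Inputs whose averaged prediction has higher entropy are the ones the sampled masks disagree on most, which is the kind of signal an uncertainty-aware classifier could surface for clinical review.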