🤖 AI Summary
Fundus lesion segmentation faces three problems: high-resolution inputs consume excessive GPU memory, small lesions are frequently missed, and existing local-global fusion methods need multiple forward passes, which slows inference. This paper proposes HRDecoder, a single-pass high-resolution decoder network built on an encoder-decoder architecture that jointly preserves local detail and models global context. Its key contributions are (1) a high-resolution representation learning module that captures fine-grained local features during decoding, and (2) a high-resolution fusion module that directly fuses multi-scale predictions, avoiding redundant multi-pass computation. On the IDRiD and DDR benchmarks, the method improves segmentation accuracy while keeping memory use manageable and inference fast.
📝 Abstract
High resolution is crucial for precise segmentation in fundus images, yet handling high-resolution inputs incurs considerable GPU memory costs, with diminishing performance gains as overhead increases. To address this issue while tackling the challenge of segmenting tiny objects, recent studies have explored local-global fusion methods. These methods preserve fine details using local regions and capture long-range contextual information from downscaled global images. However, the need for multiple forward passes incurs significant computational overhead and reduces inference speed. In this paper, we propose HRDecoder, a simple High-Resolution Decoder network for fundus lesion segmentation. It integrates a high-resolution representation learning module to capture fine-grained local features and a high-resolution fusion module to fuse multi-scale predictions. Our method improves the overall segmentation accuracy of fundus lesions while keeping memory and computational overhead reasonable and maintaining satisfactory inference speed. Experimental results on the IDRiD and DDR datasets demonstrate the effectiveness of our method. Code is available at https://github.com/CVIU-CSU/HRDecoder.
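The idea of fusing a coarse global prediction with a fine local prediction, as described above, can be illustrated with a small sketch. This is not the HRDecoder implementation: the function names `upsample_nearest` and `fuse_predictions`, the nearest-neighbour upsampling, and the plain averaging rule are all assumptions chosen to keep the example dependency-free. The actual fusion module is defined in the linked repository.

```python
# Illustrative sketch only: nearest-neighbour upsampling and plain averaging
# are assumptions, not the fusion rule used by HRDecoder.

def upsample_nearest(grid, factor):
    """Repeat every cell of a 2D list `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        # Emit independent copies so later in-place edits do not alias rows.
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_predictions(global_lr, local_hr, top, left, factor):
    """Upsample the low-resolution global map, then average it with the
    high-resolution local map inside the window that the crop covers."""
    fused = upsample_nearest(global_lr, factor)
    for i, row in enumerate(local_hr):
        for j, p in enumerate(row):
            y, x = top + i, left + j
            fused[y][x] = 0.5 * (fused[y][x] + p)
    return fused


# 2x2 lesion-probability map predicted from a downscaled global image.
global_lr = [[0.0, 0.25],
             [0.5, 0.75]]
# 2x2 high-resolution prediction for a crop whose top-left corner is (1, 1)
# in full-resolution coordinates.
local_hr = [[1.0, 0.0],
            [0.0, 1.0]]

fused = fuse_predictions(global_lr, local_hr, top=1, left=1, factor=2)
for row in fused:
    print(row)
```

Outside the crop window the fused map keeps the upsampled global prediction, and inside it the local prediction sharpens the estimate. A learned fusion, or one operating on features rather than probabilities, would replace the averaging step.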