🤖 AI Summary
This study addresses the challenges of false positives and false negatives in polyp segmentation from colonoscopy images, which arise from the high morphological variability of polyps, their visual similarity to normal tissue, and difficulties in multi-scale detection. Inspired by the human visual system, the authors propose a novel bidirectional segmentation architecture that integrates guided attention, multi-scale retinal processing, and cortical feedback mechanisms. For the first time in medical image segmentation, biological vision principles—including directed attention, parallel retinal pathways, and predictive coding feedback—are incorporated through four key components: a Guided Asymmetric Attention Module (GAAM), a Multi-Scale Retinal Module (MSRM), a Guided Cortical Attention Feedback Module (GCAFM), and a Resolution-Adaptive Polyp Encoder-Decoder Module (PEDM). Evaluated on five public datasets, the method achieves a 3–8% improvement in Dice coefficient and 10–20% enhancement in generalization, significantly reducing both over- and under-segmentation while improving model interpretability and clinical trustworthiness.
📝 Abstract
Accurate polyp segmentation in colonoscopy is essential for cancer prevention but remains challenging due to: (1) high morphological variability (from flat to protruding lesions), (2) strong visual similarity to normal structures such as folds and vessels, and (3) the need for robust multi-scale detection. Existing deep learning approaches suffer from unidirectional processing, weak multi-scale fusion, and the absence of anatomical constraints, often leading to false positives (over-segmentation of normal structures) and false negatives (missed subtle flat lesions). We propose GRAFNet, a biologically inspired architecture that emulates the hierarchical organisation of the human visual system. GRAFNet integrates three key modules: (1) a Guided Asymmetric Attention Module (GAAM) that mimics orientation-tuned cortical neurones to emphasise polyp boundaries, (2) a Multi-Scale Retinal Module (MSRM) that replicates retinal ganglion cell pathways for parallel multi-feature analysis, and (3) a Guided Cortical Attention Feedback Module (GCAFM) that applies predictive coding for iterative refinement. These are unified in a Polyp Encoder-Decoder Module (PEDM) that enforces spatial-semantic consistency via resolution-adaptive feedback. Extensive experiments on five public benchmarks (Kvasir-SEG, CVC-300, CVC-ColonDB, CVC-ClinicDB, and PolypGen) demonstrate consistent state-of-the-art performance, with 3–8% Dice improvements and 10–20% higher generalisation over leading methods, while offering interpretable decision pathways. This work establishes a paradigm in which neural computation principles bridge the gap between AI accuracy and clinically trustworthy reasoning. Code is available at https://github.com/afofanah/GRAFNet.
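The predictive-coding idea behind the GCAFM can be sketched in a few lines: a top-down prediction is reconstructed from the current segmentation estimate, compared against encoder features, and the resulting prediction error iteratively corrects the estimate. The sketch below is a minimal illustration of that loop under assumed simplifications (a single scalar feedback weight `w_fb`, dense arrays instead of learned convolutional projections); the function name and parameters are hypothetical, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refine_prediction(logits, features, w_fb=1.0, steps=3, lr=0.5):
    """Hypothetical predictive-coding refinement loop.

    logits   : current segmentation logits (H x W array)
    features : bottom-up encoder features to be explained (H x W array)
    w_fb     : assumed scalar feedback weight (stands in for a learned
               top-down projection)
    """
    for _ in range(steps):
        pred = sigmoid(logits)              # current segmentation estimate
        reconstruction = pred * w_fb        # top-down prediction of features
        error = features - reconstruction   # prediction-error signal
        logits = logits + lr * error * w_fb # error-driven correction
    return sigmoid(logits)
```

In the actual architecture the feedback projection and step size would be learned and applied per resolution level; the loop above only shows the iterative error-correction structure that predictive coding implies.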