DepthPolyp: Pseudo-Depth Guided Lightweight Segmentation for Real-Time Colonoscopy

📅 2026-05-15

🤖 AI Summary
This work addresses the performance degradation that motion blur, specular reflections, and unstable illumination cause in polyp segmentation during real-world colonoscopy by proposing a lightweight yet robust segmentation framework. The method combines pseudo-depth-guided multi-task learning with hierarchical Ghost factorization, Interleaved Shuffle Fusion, and Dynamic Group Gating, and is trained explicitly on degraded data to improve clinical generalization. With only 3.57 million parameters and 0.86 GMACs, the model outperforms architectures up to 20 times larger on real surgical videos from the PolypGen dataset and runs at over 180 FPS on mobile devices.
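To put the reported speed and compute figures in context, the short calculation below converts them into a per-frame time budget and a sustained compute rate. It is a back-of-the-envelope sketch, not part of the paper: it assumes the 0.86 GMACs figure is the cost of one forward pass on one frame and that frames are processed one at a time. The function names are hypothetical.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def sustained_gmacs_per_second(gmacs_per_frame: float, fps: float) -> float:
    """Multiply-accumulate operations per second, in units of 1e9 (GMAC/s)."""
    return gmacs_per_frame * fps


# Figures quoted in the summary; "over 180 FPS" makes these bounds, not exact values.
FPS = 180.0
GMACS_PER_FRAME = 0.86  # assumption: cost of one forward pass on one frame

budget_ms = frame_budget_ms(FPS)
throughput = sustained_gmacs_per_second(GMACS_PER_FRAME, FPS)

print(f"per-frame budget: at most {budget_ms:.2f} ms")
print(f"sustained compute: at least {throughput:.1f} GMAC/s")
```

Under these assumptions, 180 FPS leaves roughly 5.6 ms per frame, and the device has to sustain about 155 GMAC/s.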
📝 Abstract
Accurate polyp segmentation in colonoscopy is essential for early colorectal cancer detection, yet real-world clinical environments pose persistent challenges such as motion blur, specular reflections, and illumination instability. Most existing methods are optimized on clean benchmark images and suffer noticeable performance degradation when deployed in authentic surgical scenarios. We propose DepthPolyp, a lightweight and robust segmentation framework based on pseudo-depth-guided multi-task learning and efficient feature modulation. The architecture combines hierarchical Ghost factorization for compact feature generation, Interleaved Shuffle Fusion for low-cost cross-scale interaction, and Dynamic Group Gating for adaptive group-wise feature weighting. Extensive experiments demonstrate that DepthPolyp achieves strong cross-dataset generalization when trained on degraded data and evaluated on both clean and noisy target domains, consistently outperforming lightweight baselines and remaining competitive with substantially larger models. In real surgical video evaluation on PolypGen, DepthPolyp achieves better segmentation performance than models up to $20\times$ larger while preserving real-time inference speed. With only 3.57M parameters and 0.86 GMACs, the proposed method runs at over 180 FPS on mobile devices, making it well suited for real-time deployment in resource-constrained clinical environments. Code and pretrained weights are available at: https://github.com/ReaganWu/DepthPolyp/
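The abstract names Ghost factorization as the source of the model's compact feature generation. The sketch below counts weights for a standard convolution and for a Ghost module as described in the original GhostNet design, where a primary convolution produces a fraction of the output channels and cheap depthwise convolutions derive the rest. It is only an illustration of why this kind of factorization saves parameters. DepthPolyp's hierarchical variant may differ, and the channel sizes, kernel sizes, and function names here are assumptions chosen for the example.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int = 1, d: int = 3, ratio: int = 2) -> int:
    """Weights in a GhostNet-style module (bias ignored).

    A primary k x k convolution produces c_out / ratio "intrinsic" channels.
    Each intrinsic channel then yields (ratio - 1) extra channels through
    cheap d x d depthwise convolutions.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * d * d
    return primary + cheap


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(64, 128, k=3)
ghost = ghost_params(64, 128, k=3, d=3, ratio=2)

print(f"standard conv: {standard} weights")
print(f"ghost module:  {ghost} weights")
print(f"reduction:     {standard / ghost:.2f}x")
```

With a ratio of 2, the Ghost module needs roughly half the weights of the standard convolution for the same number of output channels, because the depthwise part adds very little.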
Problem

Research questions and friction points this paper is trying to address.

polyp segmentation
colonoscopy
real-time
motion blur
illumination instability
Innovation

Methods, ideas, or system contributions that make the work stand out.

pseudo-depth guidance
lightweight segmentation
multi-task learning
efficient feature modulation
real-time colonoscopy
Authors
Zhuoyu Wu
CyPhi AI Lab, Monash University, Malaysia Campus, Malaysia
Wenhui Ou
Department of Electronic & Computer Engineering, Hong Kong University of Science & Technology, Hong Kong, P.R. China
Lexi Zhang
Harbin Institute of Technology, Harbin, P.R. China
Pei-Sze Tan
Monash University
Affective Computing, Causality, Fairness
Dongjun Wu
Department of Electronic & Computer Engineering, Hong Kong University of Science & Technology, Hong Kong, P.R. China
Junhe Zhao
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China
Wenqi Fang
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China
Raphaël C.-W. Phan
CyPhi AI Lab, Monash University, Malaysia Campus, Malaysia