EndoCaver: Handling Fog, Blur and Glare in Endoscopic Images via Joint Deblurring-Segmentation

📅 2026-01-30
🤖 AI Summary
This work addresses the degradation in polyp detection performance caused by haze, motion blur, and specular reflections in clinical endoscopic images. To tackle this challenge, the authors propose a lightweight Transformer architecture featuring a unidirectionally guided dual-decoder design that jointly optimizes deblurring and segmentation. Key innovations include a Global Attention Module (GAM) for cross-scale feature aggregation, a Deblurring-Segmentation Aligner (DSA) to facilitate inter-task feature transfer, and a Learnable Cosine Scheduler (LoCoS) for dynamically balancing multi-task learning. Evaluated on the Kvasir-SEG dataset, the model achieves Dice scores of 0.922 and 0.889 on clean and severely degraded images, respectively, while reducing parameter count by 90% compared to existing methods, demonstrating strong potential for edge deployment.
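To make the idea of balancing two task losses with a cosine schedule concrete, here is a minimal sketch. It is not the authors' LoCoS: this page does not describe how LoCoS is parameterised or what it learns, so the sketch uses a fixed (non-learnable) cosine ramp. The function names `cosine_task_weight` and `combined_loss`, the endpoints `w_start` and `w_end`, and the choice to weight the deblurring term are all assumptions made for illustration.

```python
import math


def cosine_task_weight(step, total_steps, w_start=1.0, w_end=0.1):
    """Cosine-annealed weight that moves from w_start to w_end over training.

    Hypothetical helper: a plain cosine ramp, not the paper's learnable scheduler.
    """
    # Clamp progress to [0, 1] so steps outside the schedule stay at the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_end + 0.5 * (w_start - w_end) * (1.0 + math.cos(math.pi * progress))


def combined_loss(seg_loss, deblur_loss, step, total_steps):
    """Weighted sum of the two task losses (assumed form, for illustration only)."""
    w = cosine_task_weight(step, total_steps)
    return seg_loss + w * deblur_loss


# Example: the deblurring term is weighted heavily early on and lightly at the end.
early = combined_loss(seg_loss=0.8, deblur_loss=0.5, step=0, total_steps=100)
late = combined_loss(seg_loss=0.8, deblur_loss=0.5, step=100, total_steps=100)
```

In this sketch the restoration loss dominates early training and fades later, which is one plausible reading of "dynamically balancing" the two tasks; a learnable variant would presumably replace the fixed endpoints with trained parameters.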

📝 Abstract
Endoscopic image analysis is vital for colorectal cancer screening, yet real-world conditions often suffer from lens fogging, motion blur, and specular highlights, which severely compromise automated polyp detection. We propose EndoCaver, a lightweight transformer with a unidirectional-guided dual-decoder architecture, enabling joint multi-task capability for image deblurring and segmentation while significantly reducing computational complexity and model parameters. Specifically, it integrates a Global Attention Module (GAM) for cross-scale aggregation, a Deblurring-Segmentation Aligner (DSA) to transfer restoration cues, and a cosine-based scheduler (LoCoS) for stable multi-task optimisation. Experiments on the Kvasir-SEG dataset show that EndoCaver achieves 0.922 Dice on clean data and 0.889 under severe image degradation, surpassing state-of-the-art methods while reducing model parameters by 90%. These results demonstrate its efficiency and robustness, making it well-suited for on-device clinical deployment. Code is available at https://github.com/ReaganWu/EndoCaver.
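The Dice figures quoted above (0.922 and 0.889) measure overlap between a predicted polyp mask and the ground-truth mask. As a reference for how that metric is usually computed, here is a minimal sketch on binary masks stored as nested lists. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions; the authors' evaluation code may differ in thresholding and averaging.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    flat_pred = [int(v) for row in pred for v in row]
    flat_target = [int(v) for row in target for v in row]
    # Pixels marked as foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels, so the reported drop from 0.922 to 0.889 under severe degradation corresponds to a modest loss of overlap.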
Problem

Research questions and friction points this paper is trying to address.

endoscopic image degradation
lens fogging
motion blur
specular highlights
polyp detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

joint deblurring-segmentation
lightweight transformer
dual-decoder architecture
multi-task optimization
endoscopic image restoration
Zhuoyu Wu
CyPhi (ΨΦ) AI Research Lab, School of IT, Monash University, Malaysia Campus
Wenhui Ou
Department of Electronic & Computer Engineering, The Hong Kong University of Science and Technology
Pei-Sze Tan
Monash University
Affective Computing, Causality, Fairness
Jiayan Yang
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Wenqi Fang
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Zheng Wang
Associate Professor, Shenzhen Institutes of Advanced Technology
VLSI, EDA, Computer Architecture, Machine Learning
Raphael C.-W. Phan
CyPhi (ΨΦ) AI Research Lab, School of IT, Monash University, Malaysia Campus