🤖 AI Summary
To address degraded polyp segmentation performance caused by low illumination, blur, and overexposure in endoscopic images, this paper proposes a joint framework for dynamic image enhancement and segmentation, guided by CLIP semantics and driven by reinforcement learning. The method uses CLIP for semantic-level image quality assessment and builds an end-to-end closed-loop optimization pipeline that couples multimodal enhancement strategies with a lightweight U-Net segmenter. A reinforcement learning agent based on Proximal Policy Optimization (PPO) dynamically schedules enhancement operators (denoising, contrast adjustment, and sharpening), enabling quality-aware adaptive enhancement. Evaluated on Kvasir-SEG and CVC-ClinicDB, the framework achieves a Dice score of 93.7%, surpassing state-of-the-art methods by 2.1%. It significantly improves robustness on low-quality endoscopic images and supports real-time deployment on embedded devices.
📝 Abstract
Because of human and environmental interference, captured polyp images often suffer from dim lighting, blur, and overexposure, which complicates downstream polyp segmentation. To address noise-induced degradation in polyp images, we present AgentPolyp, a novel framework that integrates CLIP-based semantic guidance and dynamic image enhancement with a lightweight neural network for segmentation. The agent first evaluates image quality using CLIP-driven semantic analysis (e.g., identifying "low-contrast polyps with vascular textures") and adapts its reinforcement learning strategy to dynamically apply multi-modal enhancement operations (e.g., denoising, contrast adjustment). A quality assessment feedback loop jointly optimizes pixel-level enhancement and segmentation focus, ensuring robust preprocessing before neural network segmentation. This modular architecture supports plug-and-play extensions for various enhancement algorithms and segmentation networks, meeting deployment requirements for endoscopic devices.
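The quality-assessment feedback loop described above can be illustrated with a minimal sketch. This is not the AgentPolyp implementation. It makes two loud assumptions: a hand-written `quality_score` (contrast minus an exposure penalty) stands in for the CLIP-based semantic assessment, and a greedy "pick the operator that most improves the score" rule stands in for the learned PPO policy. All function names, operators, and parameters here are hypothetical and chosen only for illustration.

```python
import numpy as np


def quality_score(img):
    """Hypothetical proxy for the CLIP-based quality assessment.

    Rewards contrast (standard deviation) and penalizes mean brightness
    far from mid-gray. Expects a 2-D float image with values in [0, 1].
    """
    return float(img.std() - abs(img.mean() - 0.5))


def adjust_contrast(img, gain=1.5):
    """Stretch intensities around the image mean, then clip to [0, 1]."""
    mean = img.mean()
    return np.clip((img - mean) * gain + mean, 0.0, 1.0)


def brighten(img, gamma=0.6):
    """Gamma correction; gamma < 1 lifts dark regions."""
    return np.clip(img, 0.0, 1.0) ** gamma


def denoise(img):
    """3x3 box blur built from shifted views of an edge-padded image."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    acc = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0


# Illustrative operator pool; the paper's actual operator set may differ.
OPERATORS = {
    "contrast": adjust_contrast,
    "brighten": brighten,
    "denoise": denoise,
}


def enhance(img, max_steps=3):
    """Greedy stand-in for the learned policy.

    At each step, try every operator, keep the one that raises the
    quality score the most, and stop when no operator improves it.
    Returns the enhanced image and the list of applied operator names.
    """
    applied = []
    score = quality_score(img)
    for _ in range(max_steps):
        candidates = {name: op(img) for name, op in OPERATORS.items()}
        best = max(candidates, key=lambda n: quality_score(candidates[n]))
        best_score = quality_score(candidates[best])
        if best_score <= score:
            break  # feedback loop: no operator helps any further
        img, score = candidates[best], best_score
        applied.append(best)
    return img, applied


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dark = 0.1 + 0.05 * rng.random((32, 32))  # synthetic dim, low-contrast image
    enhanced, ops = enhance(dark)
    print("applied operators:", ops)
    print("score before:", quality_score(dark), "after:", quality_score(enhanced))
```

In the framework the abstract describes, the enhanced image would then be passed to the segmentation network, and the policy would be trained rather than greedy. The sketch only shows the closed loop of assess, enhance, and re-assess.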