Learning with Geometric Priors in U-Net Variants for Polyp Segmentation

📅 2026-01-24
🤖 AI Summary
Existing U-Net variants struggle to capture the geometric structure of polyps in low-contrast or cluttered colonoscopy images, which limits segmentation accuracy. To address this, this work proposes a plug-and-play Geometric Prior-guided Module (GPM) that, for the first time, explicitly incorporates geometric priors in the form of depth maps into polyp segmentation. A Visual Geometry Grounded Transformer (VGGT), fine-tuned on a simulated ColonDepth dataset, generates endoscopy-specific depth maps; GPM encodes them into the U-Net encoder's feature maps and refines the result with spatial and channel attention, which emphasize local spatial and global channel information respectively. Experiments on five public polyp segmentation datasets show consistent gains over three strong baselines, supporting the module's robustness, generalizability, and compatibility with diverse U-Net architectures.

📝 Abstract
Accurate and robust polyp segmentation is essential for early colorectal cancer detection and for computer-aided diagnosis. While convolutional neural network-, Transformer-, and Mamba-based U-Net variants have achieved strong performance, they still struggle to capture geometric and structural cues, especially in low-contrast or cluttered colonoscopy scenes. To address this challenge, we propose a novel Geometric Prior-guided Module (GPM) that injects explicit geometric priors into U-Net-based architectures for polyp segmentation. Specifically, we fine-tune the Visual Geometry Grounded Transformer (VGGT) on a simulated ColonDepth dataset to estimate depth maps of polyp images tailored to the endoscopic domain. These depth maps are then processed by GPM to encode geometric priors into the encoder's feature maps, where they are further refined using spatial and channel attention mechanisms that emphasize both local spatial and global channel information. GPM is plug-and-play and can be seamlessly integrated into diverse U-Net variants. Extensive experiments on five public polyp segmentation datasets demonstrate consistent gains over three strong baselines. Code and the generated depth maps are available at: https://github.com/fvazqu/GPM-PolypSeg
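The fusion step described in the abstract (a depth map encoded into an encoder feature map, then refined by channel and spatial attention) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the per-channel depth projection `w_depth`, the linear channel gate `w_channel`, and the mean/max spatial gate are all hypothetical choices made here for clarity. The actual GPM design is in the linked repository.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def resize_nearest(depth, out_h, out_w):
    """Nearest-neighbour resize of a 2-D depth map to (out_h, out_w)."""
    in_h, in_w = depth.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return depth[rows][:, cols]


def geometric_prior_fusion(feat, depth, w_depth, w_channel):
    """Hypothetical depth-guided fusion for one encoder stage.

    feat      : (C, H, W) encoder feature map
    depth     : (H0, W0) depth map, e.g. from a depth estimator
    w_depth   : (C,)   assumed per-channel projection of the depth map
    w_channel : (C, C) assumed weights of the channel-attention gate
    """
    _, h, w = feat.shape

    # 1. Match the depth map to the feature resolution and inject it
    #    into every channel (a stand-in for a learned 1x1 projection).
    d = resize_nearest(depth, h, w)
    fused = feat + w_depth[:, None, None] * d[None, :, :]

    # 2. Channel attention: global average pooling, a linear layer,
    #    and a sigmoid gate that rescales each channel.
    pooled = fused.mean(axis=(1, 2))            # shape (C,)
    channel_gate = sigmoid(w_channel @ pooled)  # shape (C,)
    fused = fused * channel_gate[:, None, None]

    # 3. Spatial attention: combine channel-wise mean and max into a
    #    per-pixel sigmoid gate that rescales each location.
    spatial = 0.5 * (fused.mean(axis=0) + fused.max(axis=0))
    spatial_gate = sigmoid(spatial)             # shape (H, W)
    return fused * spatial_gate[None, :, :]


# Usage with random placeholder data (not real colonoscopy features).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 8))
depth_map = rng.random((32, 32))
refined = geometric_prior_fusion(
    features, depth_map, rng.normal(size=4), rng.normal(size=(4, 4))
)
print(refined.shape)  # (4, 8, 8)
```

Because the output keeps the shape of the input feature map, a block of this kind could in principle be dropped between encoder stages of different U-Net variants without changing the rest of the network, which is the sense in which the abstract calls GPM plug-and-play.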
Problem

Research questions and friction points this paper is trying to address.

polyp segmentation
geometric priors
U-Net variants
colonoscopy images
structural cues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometric Prior
Polyp Segmentation
U-Net Variants
Depth Estimation
Attention Mechanism
Authors
Fabian Vazquez
The University of Texas Rio Grande Valley, Edinburg, TX, USA
J. A. Nuñez
The University of Texas Rio Grande Valley, Edinburg, TX, USA
Diego Adame
The University of Texas Rio Grande Valley, Edinburg, TX, USA
Alissen Moreno
The University of Texas Rio Grande Valley, Edinburg, TX, USA
Augustin Zhan
Sewickley Academy, Sewickley, PA, USA
Huimin Li
Ph.D. @ TU Delft/Postdoc @ TU Darmstadt
Hardware Security, RISC-V, SCA, ML, FPGA
Jinghao Yang
The University of Texas Rio Grande Valley, Edinburg, TX, USA
Haoteng Tang
Assistant Professor in Computer Science, University of Texas Rio Grande Valley.
machine learning, data mining, medical image computing and bioinformatics
Bin Fu
The University of Texas Rio Grande Valley, Edinburg, TX, USA
Pengfei Gu
Assistant Professor in Computer Science, University of Texas Rio Grande Valley
Computer Vision, Deep Learning, Medical Image Analysis, Scientific Visualization