Topo-VM-UNetV2: Encoding Topology into Vision Mamba UNet for Polyp Segmentation

📅 2025-05-09
🤖 AI Summary
Mamba-based architectures struggle to capture topological structures in polyp segmentation, such as connected components, loops, and voids, which leads to inaccurate boundary delineation. To address this, we propose Topo-VM-UNetV2, which encodes topological features into the Mamba-based VM-UNetV2 model through Topo-SDI, a topology-guided semantics and detail infusion module. Topology attention maps are derived from the persistence diagrams of VM-UNetV2 probability maps: persistence scores are assigned to each topological feature's birth location and converted into attention weights with a sigmoid function. These maps then guide feature fusion inside the SDI module. Experiments on five public polyp segmentation datasets demonstrate the effectiveness of the method. The source code will be made publicly available.

📝 Abstract
Convolutional neural network (CNN) and Transformer-based architectures are two dominant deep learning models for polyp segmentation. However, CNNs have limited capability for modeling long-range dependencies, while Transformers incur quadratic computational complexity. Recently, State Space Models such as Mamba have been recognized as a promising approach for polyp segmentation because they not only model long-range interactions effectively but also maintain linear computational complexity. However, Mamba-based architectures still struggle to capture topological features (e.g., connected components, loops, voids), leading to inaccurate boundary delineation and segmentation. To address these limitations, we propose a new approach called Topo-VM-UNetV2, which encodes topological features into the Mamba-based state-of-the-art polyp segmentation model, VM-UNetV2. Our method consists of two stages. In Stage 1, VM-UNetV2 is used to generate probability maps (PMs) for the training and test images, which are then used to compute topology attention maps. Specifically, we first compute persistence diagrams of the PMs; we then generate persistence score maps by assigning the persistence value (i.e., the difference between death and birth times) of each topological feature to its birth location; finally, we transform the persistence scores into attention weights using the sigmoid function. In Stage 2, these topology attention maps are integrated into the semantics and detail infusion (SDI) module of VM-UNetV2 to form a topology-guided semantics and detail infusion (Topo-SDI) module for enhancing the segmentation results. Extensive experiments on five public polyp segmentation datasets demonstrate the effectiveness of our proposed method. The code will be made publicly available.
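The Stage 1 pipeline (probability map, persistence score map, sigmoid attention) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: only 0-dimensional features (connected components) are tracked, a superlevel-set filtration with 4-connectivity is used, and the component that never dies is given persistence equal to the map's maximum minus its minimum. The function names `persistence_score_map` and `topology_attention` are hypothetical; a full treatment would also handle loops, typically with a dedicated persistent homology library.

```python
import numpy as np

def persistence_score_map(pm):
    """0-dim superlevel-set persistence of a 2D map, written at birth pixels.

    Assumptions (not from the paper): 4-connectivity, components only,
    and the never-dying component gets persistence max(pm) - min(pm).
    """
    h, w = pm.shape
    flat = pm.ravel()
    order = np.argsort(-flat, kind="stable")      # visit pixels from high to low value
    parent = np.full(h * w, -1, dtype=np.int64)   # -1 marks "not visited yet"
    birth = np.zeros(h * w, dtype=np.int64)       # root -> flat index of birth pixel
    scores = np.zeros(h * w, dtype=float)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]         # path halving
            i = parent[i]
        return i

    for idx in order:
        parent[idx] = idx                         # each pixel starts its own component
        birth[idx] = idx
        r, c = divmod(int(idx), w)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < h and 0 <= cc < w):
                continue
            n = rr * w + cc
            if parent[n] == -1:
                continue
            a, b = find(idx), find(n)
            if a == b:
                continue
            # Elder rule: the component born at the lower value dies at this pixel.
            if flat[birth[a]] < flat[birth[b]]:
                a, b = b, a
            scores[birth[b]] = flat[birth[b]] - flat[idx]   # |death - birth|
            parent[b] = a

    root = find(order[0])                         # component of the global maximum
    scores[birth[root]] = flat[birth[root]] - flat.min()
    return scores.reshape(h, w)

def topology_attention(pm):
    """Map persistence scores to attention weights with a sigmoid."""
    return 1.0 / (1.0 + np.exp(-persistence_score_map(pm)))

# Toy probability map with two peaks separated by a low-probability valley.
pm = np.array([[0.9, 0.1, 0.8],
               [0.1, 0.1, 0.1]])
attention = topology_attention(pm)
```

In this sketch, pixels that do not give birth to a persistent feature keep a score of 0 and therefore an attention weight of 0.5, while birth locations of long-lived components receive larger weights.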
Problem

Research questions and friction points this paper is trying to address.

Encode topology into a Mamba-based model for polyp segmentation
Address limited long-range dependency modeling in CNNs and quadratic complexity in Transformers
Improve boundary delineation with topological feature integration
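To make the "long-range interactions at linear cost" point concrete, here is a minimal sketch of a discrete linear state space recurrence, h_t = A h_{t-1} + B x_t and y_t = C h_t. It is a plain time-invariant recurrence, not Mamba itself: Mamba additionally makes its parameters input-dependent and uses an optimized scan. The function name `ssm_scan` and the toy parameters are assumptions for illustration.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run h_t = A h_{t-1} + B x_t, y_t = C h_t over a 1D input sequence.

    One state update per step, so the cost grows linearly with len(x).
    """
    h = np.zeros(A.shape[0])
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = A @ h + B * x_t   # carry information forward through the state
        y[t] = C @ h          # read the output from the state
    return y

# Toy example: a single impulse at t = 0 keeps influencing later outputs.
A = np.array([[0.5]])
B = np.array([1.0])
C = np.array([1.0])
y = ssm_scan(np.array([1.0, 0.0, 0.0]), A, B, C)
```

The state `h` is what lets an early input affect distant positions without comparing every pair of positions, which is where the quadratic cost of self-attention comes from.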
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates topology into a Mamba-based UNet model (VM-UNetV2)
Derives topology attention maps from persistence diagrams of probability maps
Enhances segmentation with a topology-guided SDI (Topo-SDI) module
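The source does not spell out how the attention maps enter the SDI module, so the following is only one plausible sketch, not the authors' design. It assumes that multi-level features already share a channel count, that they are resized to a target level with nearest-neighbor sampling and combined by elementwise product, and that the topology attention map is resized the same way and multiplied in. The names `resize_nearest` and `topo_guided_fusion` are hypothetical.

```python
import numpy as np

def resize_nearest(a, out_h, out_w):
    """Nearest-neighbor resize over the last two axes of an array."""
    rows = np.arange(out_h) * a.shape[-2] // out_h
    cols = np.arange(out_w) * a.shape[-1] // out_w
    return a[..., rows[:, None], cols[None, :]]

def topo_guided_fusion(features, attention, level):
    """Fuse multi-scale features at one level and weight them by attention.

    features:  list of (C, H_i, W_i) arrays with a shared channel count C
    attention: (H, W) array of weights in (0, 1)
    level:     index of the feature map whose resolution is the target
    """
    _, h, w = features[level].shape
    fused = np.ones_like(features[level])
    for f in features:
        fused = fused * resize_nearest(f, h, w)    # elementwise product across levels
    # Broadcast the resized attention map over the channel axis.
    return fused * resize_nearest(attention, h, w)[None, :, :]

# Toy example: two feature levels and a uniform attention map.
features = [np.full((2, 4, 4), 2.0), np.full((2, 2, 2), 3.0)]
attention = np.full((8, 8), 0.5)
out = topo_guided_fusion(features, attention, level=1)
```

A learned variant would typically replace the fixed resize with convolutions and train the fusion end to end; the sketch only shows where a spatial attention map could modulate the fused features.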
Authors
Diego Adame
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
J. A. Nunez
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Fabian Vazquez
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Nayeli Gurrola
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Huimin Li
Ph.D. @ TU Delft/Postdoc @ TU Darmstadt
Hardware Security, RISC-V, SCA, ML, FPGA
Haoteng Tang
Assistant Professor in Computer Science, University of Texas Rio Grande Valley.
machine learning, data mining, medical image computing and bioinformatics
Bin Fu
Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, TX 78539, USA
Pengfei Gu
Assistant Professor in Computer Science, University of Texas Rio Grande Valley
Computer Vision, Deep Learning, Medical Image Analysis, Scientific Visualization