Mamba Based Feature Extraction And Adaptive Multilevel Feature Fusion For 3D Tumor Segmentation From Multi-modal Medical Image

📅 2025-04-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address weak global modeling capability, high computational overhead, and difficulty in fusing heterogeneous modalities in multimodal 3D medical image tumor segmentation, this paper proposes a novel multimodal segmentation framework based on an improved Mamba architecture. Our method introduces three key innovations: (1) modality-specific Mamba encoders that capture long-range dependencies tailored to individual imaging modalities (e.g., PET, CT, MRI sequences); (2) a two-level collaborative fusion module integrating modality-wise and channel-wise attention for adaptive multi-level feature aggregation; and (3) a lightweight multi-scale decoder to enhance spatial localization accuracy. Evaluated on PET/CT and multi-sequence MRI datasets, our approach consistently outperforms state-of-the-art CNNs, Transformers, and standard Mamba baselines—achieving a +3.2% Dice score improvement while delivering superior robustness and inference efficiency.

Technology Category

Application Category

📝 Abstract
Multi-modal 3D medical image segmentation aims to accurately identify tumor regions across different modalities, facing challenges from variations in image intensity and tumor morphology. Traditional convolutional neural network (CNN)-based methods struggle with capturing global features, while Transformers-based methods, despite effectively capturing global context, encounter high computational costs in 3D medical image segmentation. The Mamba model combines linear scalability with long-distance modeling, making it a promising approach for visual representation learning. However, Mamba-based 3D multi-modal segmentation still struggles to leverage modality-specific features and fuse complementary information effectively. In this paper, we propose a Mamba based feature extraction and adaptive multilevel feature fusion for 3D tumor segmentation using multi-modal medical image. We first develop the specific modality Mamba encoder to efficiently extract long-range relevant features that represent anatomical and pathological structures present in each modality. Moreover, we design an bi-level synergistic integration block that dynamically merges multi-modal and multi-level complementary features by the modality attention and channel attention learning. Lastly, the decoder combines deep semantic information with fine-grained details to generate the tumor segmentation map. Experimental results on medical image datasets (PET/CT and MRI multi-sequence) show that our approach achieve competitive performance compared to the state-of-the-art CNN, Transformer, and Mamba-based approaches.
Problem

Research questions and friction points this paper is trying to address.

Accurate 3D tumor segmentation across multi-modal medical images
Effective fusion of modality-specific and complementary features
Balancing computational efficiency with global context modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba encoder extracts long-range modality-specific features
Bi-level block dynamically fuses multi-modal complementary features
Decoder combines semantic and fine details for segmentation
🔎 Similar Papers
No similar papers found.
Z
Zexin Ji
School of Computer Science and Engineering, Central South University, Changsha, 410083, China; Hunan Engineering Research Center of Machine Vision and Intelligent Medicine, Central South University, Changsha, 410083, China
B
Beiji Zou
School of Computer Science and Engineering, Central South University, Changsha, 410083, China; Hunan Engineering Research Center of Machine Vision and Intelligent Medicine, Central South University, Changsha, 410083, China
X
Xiaoyan Kui
School of Computer Science and Engineering, Central South University, Changsha, 410083, China; Hunan Engineering Research Center of Machine Vision and Intelligent Medicine, Central South University, Changsha, 410083, China
H
Hua Li
Department of Radiation Oncology, Washington University in St. Louis, USA
P
Pierre Vera
Department of Nuclear Medicine, Henri Becquerel Cancer Center, Rouen, France
Su Ruan
Su Ruan
Université de Rouen Normandie, France
data fusionmedical image analysis and processingmachine learning