LM-Net: A Light-weight and Multi-scale Network for Medical Image Segmentation

📅 2025-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address boundary ambiguity and over- and under-segmentation in medical image segmentation, which stem from insufficient multi-scale modeling and poor integration of local texture with global semantics, this paper proposes LM-Net, a lightweight multi-scale network. Methodologically, LM-Net introduces: (1) a novel CNN-ViT hybrid multi-branch architecture; (2) Local Feature Transformer (LFT) and Global Feature Transformer (GFT) modules that jointly enable intra-layer multi-scale perception and inter-layer local-global collaborative modeling; and (3) lightweight convolutional blocks coupled with hierarchical attention mechanisms that efficiently fuse CNN-captured local details with ViT-derived global semantics. Evaluated on three public multi-modal datasets, LM-Net achieves state-of-the-art performance with only 4.66 G FLOPs and 5.4 M parameters, significantly improving both boundary delineation accuracy and overall segmentation precision.

📝 Abstract
Current medical image segmentation approaches have limitations in deeply exploring multi-scale information and effectively combining local detail textures with global contextual semantic information. This results in over-segmentation, under-segmentation, and blurred segmentation boundaries. To tackle these challenges, we explore multi-scale feature representations from different perspectives, proposing a novel, lightweight, and multi-scale architecture (LM-Net) that integrates advantages of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) to enhance segmentation accuracy. LM-Net employs a lightweight multi-branch module to capture multi-scale features at the same level. Furthermore, we introduce two modules to concurrently capture local detail textures and global semantics with multi-scale features at different levels: the Local Feature Transformer (LFT) and Global Feature Transformer (GFT). The LFT integrates local window self-attention to capture local detail textures, while the GFT leverages global self-attention to capture global contextual semantics. By combining these modules, our model achieves complementarity between local and global representations, alleviating the problem of blurred segmentation boundaries in medical image segmentation. To evaluate the feasibility of LM-Net, extensive experiments have been conducted on three publicly available datasets with different modalities. Our proposed model achieves state-of-the-art results, surpassing previous methods, while only requiring 4.66G FLOPs and 5.4M parameters. These state-of-the-art results on three datasets with different modalities demonstrate the effectiveness and adaptability of our proposed LM-Net for various medical image segmentation tasks.
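The abstract's contrast between the LFT's local window self-attention and the GFT's global self-attention can be illustrated with a minimal sketch. This is not the authors' implementation: it is a single-head, projection-free NumPy toy, and the window size of 4 is a hypothetical choice for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (tokens, dim). Scaled dot-product attention with identity
    # Q/K/V projections, kept minimal for illustration.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def global_attention(x):
    # GFT-style: every token attends to every other token,
    # capturing global contextual semantics.
    return self_attention(x)

def local_window_attention(x, window=4):
    # LFT-style: tokens attend only within non-overlapping local
    # windows, capturing local detail textures.
    out = np.empty_like(x)
    for start in range(0, x.shape[0], window):
        out[start:start + window] = self_attention(x[start:start + window])
    return out

x = np.random.default_rng(0).standard_normal((16, 8))  # 16 tokens, dim 8
g = global_attention(x)
l = local_window_attention(x, window=4)
print(g.shape, l.shape)
```

Both paths preserve the token shape, so their outputs can be fused; LM-Net's actual modules additionally use learned projections, multiple heads, and multi-scale branches.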
Problem

Research questions and friction points this paper is trying to address.

Medical Image Segmentation
Multi-scale Information
Boundary Clarity
Innovation

Methods, ideas, or system contributions that make the work stand out.

LM-Net
Local and Global Feature Transformation
Efficient Medical Image Segmentation
Zhenkun Lu
College of Electronic Information, Guangxi Minzu University, Nanning, China
Chaoyin She
Northwestern Polytechnical University
Medical imaging, Visual Language Model, Embodied Intelligence
Wei Wang
Department of Medical Ultrasonics, Institute of Diagnostic and Interventional Ultrasound, The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
Qinghua Huang
Professor of Biomedical Engineering, Northwestern Polytechnical University
Medical imaging, Ultrasound, Pattern Recognition, Data mining