LHU-Net: A Light Hybrid U-Net for Cost-Efficient, High-Performance Volumetric Medical Image Segmentation

📅 2024-04-07

🏛️ arXiv.org

📈 Citations: 4

✨ Influential: 0

career value

188K/year

🤖 AI Summary

Existing CNN-Transformer hybrid models for 3D medical image segmentation suffer from structural redundancy and coupled spatial-channel feature modeling, leading to suboptimal trade-offs between accuracy and efficiency. To address this, we propose LHU-Net, a lightweight hybrid U-shaped network. Its core innovation is the first “spatial-first, channel-second” two-stage feature decoupling paradigm, explicitly separating spatial and channel modeling sequences within a hybrid architecture. LHU-Net employs a lightweight U-Net backbone, a synergistic CNN-Transformer encoder, and an efficient multi-scale fusion module. With only 10.9M parameters and no pretraining or ensemble required, it achieves state-of-the-art performance across five benchmarks—including Synapse, LA, and ACDC—reaching a Dice score of 92.66% on ACDC. Compared to mainstream methods, it reduces parameters by 85% and FLOPs by 75%. The code is publicly available.

Technology Category

Application Category

📝 Abstract

The rise of Transformer architectures has revolutionized medical image segmentation, leading to hybrid models that combine Convolutional Neural Networks (CNNs) and Transformers for enhanced accuracy. However, these models often suffer from increased complexity and overlook the interplay between spatial and channel features, which is vital for segmentation precision. We introduce LHU-Net, a streamlined Hybrid U-Net for volumetric medical image segmentation, designed to first analyze spatial and then channel features for effective feature extraction. Tested on five benchmark datasets (Synapse, LA, Pancreas, ACDC, BRaTS 2018), LHU-Net demonstrated superior efficiency and accuracy, notably achieving a 92.66 Dice score on ACDC with 85% fewer parameters and a quarter of the computational demand compared to leading models. This performance, achieved without pre-training, extra data, or model ensembles, sets new benchmarks for computational efficiency and accuracy in segmentation, using under 11 million parameters. This achievement highlights that balancing computational efficiency with high accuracy in medical image segmentation is feasible. Our implementation of LHU-Net is freely accessible to the research community on GitHub (https://github.com/xmindflow/LHUNet).

Problem

Research questions and friction points this paper is trying to address.

Reducing complexity in hybrid CNN-Transformer medical image segmentation

Improving spatial and channel feature integration for precise segmentation

Achieving high performance with fewer parameters and computational costs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines CNNs and Transformers efficiently

Prioritizes spatial then channel features

Uses fewer parameters and FLOPs

🔎 Similar Papers

No similar papers found.