🤖 AI Summary
Existing CNN-Transformer hybrid models for 3D medical image segmentation suffer from structural redundancy and coupled spatial-channel feature modeling, leading to suboptimal trade-offs between accuracy and efficiency. To address this, we propose LHU-Net, a lightweight hybrid U-shaped network. Its core innovation is a “spatial-first, channel-second” two-stage feature decoupling paradigm, which explicitly separates spatial modeling from channel modeling and applies them in that order within a hybrid architecture. LHU-Net employs a lightweight U-Net backbone, a synergistic CNN-Transformer encoder, and an efficient multi-scale fusion module. With only 10.9M parameters and no pretraining or ensembling, it achieves state-of-the-art performance across five benchmarks (Synapse, LA, Pancreas, ACDC, and BraTS 2018), reaching a Dice score of 92.66% on ACDC. Compared to leading methods, it reduces parameters by 85% and FLOPs by 75%. The code is publicly available.
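To make the "spatial-first, channel-second" ordering concrete, here is a minimal NumPy sketch. It is not LHU-Net's actual architecture: the parameter-free gates `spatial_gate` and `channel_gate` and the function names are assumptions chosen only to illustrate applying a spatial reweighting step before a channel reweighting step on a volumetric feature map.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_gate(x):
    """Reweight every voxel using statistics pooled over channels.

    x has shape (C, D, H, W). Hypothetical stand-in for a spatial
    attention stage; the real model uses learned layers.
    """
    pooled = x.mean(axis=0, keepdims=True)          # (1, D, H, W)
    return x * sigmoid(pooled)

def channel_gate(x):
    """Reweight every channel using statistics pooled over space.

    Hypothetical stand-in for a channel attention stage.
    """
    pooled = x.mean(axis=(1, 2, 3), keepdims=True)  # (C, 1, 1, 1)
    return x * sigmoid(pooled)

def spatial_then_channel(x):
    # Stage 1: spatial modeling; stage 2: channel modeling.
    return channel_gate(spatial_gate(x))

features = np.random.default_rng(0).normal(size=(8, 4, 16, 16))
out = spatial_then_channel(features)
print(out.shape)  # same shape as the input feature map
```

The two stages are kept as separate functions so that each one pools over only one kind of axis, which is the decoupling idea the summary describes.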
📝 Abstract
The rise of Transformer architectures has revolutionized medical image segmentation, leading to hybrid models that combine Convolutional Neural Networks (CNNs) and Transformers for enhanced accuracy. However, these models often suffer from increased complexity and overlook the interplay between spatial and channel features, which is vital for segmentation precision. We introduce LHU-Net, a streamlined Hybrid U-Net for volumetric medical image segmentation, designed to analyze spatial features first and channel features second for effective feature extraction. Tested on five benchmark datasets (Synapse, LA, Pancreas, ACDC, BraTS 2018), LHU-Net demonstrated superior efficiency and accuracy, notably achieving a Dice score of 92.66% on ACDC with 85% fewer parameters and a quarter of the computational demand compared to leading models. This performance, achieved with under 11 million parameters and without pre-training, extra data, or model ensembles, sets new benchmarks for computational efficiency and accuracy in segmentation. It shows that balancing computational efficiency with high accuracy in medical image segmentation is feasible. Our implementation of LHU-Net is freely accessible to the research community on GitHub (https://github.com/xmindflow/LHUNet).
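For readers unfamiliar with the metric reported above, the Dice score for a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following NumPy sketch computes it on label volumes; the function names and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not the evaluation code used by the authors.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / denom

def mean_dice(pred_labels, target_labels, class_ids):
    """Average the per-class Dice over the given label ids."""
    scores = [dice_score(pred_labels == c, target_labels == c) for c in class_ids]
    return float(np.mean(scores))

# Toy 3D volumes: two slabs of 32 voxels each that overlap in 16 voxels.
pred = np.zeros((4, 4, 4), dtype=int)
target = np.zeros((4, 4, 4), dtype=int)
pred[:2] = 1
target[1:3] = 1
print(mean_dice(pred, target, class_ids=[1]))  # 2 * 16 / (32 + 32) = 0.5
```

A reported value such as 92.66% corresponds to a mean Dice of 0.9266 on this 0 to 1 scale.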