LHU-Net: A Light Hybrid U-Net for Cost-Efficient, High-Performance Volumetric Medical Image Segmentation

📅 2024-04-07
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
📄 PDF
🤖 AI Summary
Existing CNN-Transformer hybrid models for 3D medical image segmentation suffer from structural redundancy and coupled spatial-channel feature modeling, leading to suboptimal trade-offs between accuracy and efficiency. To address this, we propose LHU-Net, a lightweight hybrid U-shaped network. Its core innovation is the first “spatial-first, channel-second” two-stage feature decoupling paradigm, explicitly separating spatial and channel modeling sequences within a hybrid architecture. LHU-Net employs a lightweight U-Net backbone, a synergistic CNN-Transformer encoder, and an efficient multi-scale fusion module. With only 10.9M parameters and no pretraining or ensemble required, it achieves state-of-the-art performance across five benchmarks—including Synapse, LA, and ACDC—reaching a Dice score of 92.66% on ACDC. Compared to mainstream methods, it reduces parameters by 85% and FLOPs by 75%. The code is publicly available.

Technology Category

Application Category

📝 Abstract
The rise of Transformer architectures has revolutionized medical image segmentation, leading to hybrid models that combine Convolutional Neural Networks (CNNs) and Transformers for enhanced accuracy. However, these models often suffer from increased complexity and overlook the interplay between spatial and channel features, which is vital for segmentation precision. We introduce LHU-Net, a streamlined Hybrid U-Net for volumetric medical image segmentation, designed to first analyze spatial and then channel features for effective feature extraction. Tested on five benchmark datasets (Synapse, LA, Pancreas, ACDC, BRaTS 2018), LHU-Net demonstrated superior efficiency and accuracy, notably achieving a 92.66 Dice score on ACDC with 85% fewer parameters and a quarter of the computational demand compared to leading models. This performance, achieved without pre-training, extra data, or model ensembles, sets new benchmarks for computational efficiency and accuracy in segmentation, using under 11 million parameters. This achievement highlights that balancing computational efficiency with high accuracy in medical image segmentation is feasible. Our implementation of LHU-Net is freely accessible to the research community on GitHub (https://github.com/xmindflow/LHUNet).
Problem

Research questions and friction points this paper is trying to address.

Reducing complexity in hybrid CNN-Transformer medical image segmentation
Improving spatial and channel feature integration for precise segmentation
Achieving high performance with fewer parameters and computational costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines CNNs and Transformers efficiently
Prioritizes spatial then channel features
Uses fewer parameters and FLOPs
🔎 Similar Papers
No similar papers found.
Y
Yousef Sadegheih
Faculty of Informatics and Data Science, University of Regensburg, Regensburg, 93053, Germany
Afshin Bozorgpour
Afshin Bozorgpour
Sharif University of Technology
Deep LearningComputer VisionImage Processing
Pratibha Kumari
Pratibha Kumari
University of Regensburg
Continual learningAnomaly detectionAdaptive learningConcept driftSurveillance
R
Reza Azad
Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, 52062 Aachen, Germany
Dorit Merhof
Dorit Merhof
Professor, Faculty of Informatics and Computer Science, University of Regensburg