🤖 AI Summary
This work proposes ALL U-KAN, the first fully Kolmogorov–Arnold (KA)-based deep architecture, evaluated on medical image segmentation. It addresses the training difficulty and excessive GPU memory consumption that have prevented KAN layers from being stacked deeply. The model replaces conventional fully connected and convolutional layers entirely with KA and KAonv layers, respectively. To make deep stacking practical, the authors introduce a Share-activation KAN (SaKAN), which simplifies the parameterization, and a Grad-Free Spline, which lowers both memory footprint and computational cost. On three medical image segmentation tasks, ALL U-KAN achieves higher segmentation accuracy than partial KA-based and traditional architectures. Compared with a directly deeply stacked KAN, it uses 10× fewer parameters and more than 20× less GPU memory.
📝 Abstract
Deeply stacked KANs have been practically infeasible because of high training difficulty and substantial memory requirements. Consequently, existing studies incorporate only a few KAN layers, which hinders comprehensive exploration of KANs. This study overcomes these limitations and introduces the first fully KA-based deep model, demonstrating that KA-based layers can entirely replace traditional architectures in deep learning and achieve superior learning capacity. Specifically: (1) To ease training difficulty, the proposed Share-activation KAN (SaKAN) reformulates Sprecher's variant of the Kolmogorov-Arnold representation theorem, achieving better optimization through its simplified parameterization and denser training samples. (2) This paper shows that spline gradients contribute negligibly to training while consuming a large amount of GPU memory, and therefore proposes the Grad-Free Spline to significantly reduce memory usage and computational overhead. (3) Building on these two innovations, ALL U-KAN is the first representative implementation of a fully KA-based deep model, in which the proposed KA and KAonv layers completely replace FC and Conv layers. Extensive evaluations on three medical image segmentation tasks confirm the superiority of the fully KA-based architecture over partial KA-based and traditional architectures, with higher segmentation accuracy on all three. Compared with a directly deeply stacked KAN, ALL U-KAN achieves a 10-fold reduction in parameter count and reduces memory consumption by more than 20 times, opening up new explorations of deep KAN architectures.
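To give a rough sense of why sharing activations can shrink the parameter count, the sketch below compares two counting schemes. It is not the paper's SaKAN formulation, which the abstract does not spell out. The function names, the default grid size and spline order, and the "one shared spline plus one scalar weight per edge" layout are all illustrative assumptions. Only spline coefficients and per-edge scalars are counted.

```python
def kan_layer_params(in_dim, out_dim, grid_size=5, spline_order=3):
    """Spline coefficients in a KAN-style layer that learns one
    univariate B-spline per (input, output) edge.

    A B-spline with `grid_size` intervals and order `spline_order`
    has grid_size + spline_order basis coefficients.
    """
    coeffs_per_fn = grid_size + spline_order
    return in_dim * out_dim * coeffs_per_fn


def shared_activation_params(in_dim, out_dim, grid_size=5, spline_order=3):
    """Hypothetical shared-activation layout (an assumption, not the
    paper's exact design): a single spline reused by every edge,
    plus one scalar weight per edge."""
    coeffs_per_fn = grid_size + spline_order
    return coeffs_per_fn + in_dim * out_dim


# Example: a 64 -> 64 layer with the default grid and order.
per_edge = kan_layer_params(64, 64)
shared = shared_activation_params(64, 64)
ratio = per_edge / shared
```

For a 64-to-64 layer with these defaults, the per-edge layout stores 32,768 coefficients and the shared layout stores 4,104, roughly an 8× difference. The 10× figure in the abstract is the paper's own measurement against a directly stacked KAN and is not reproduced by this count.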