Residual Kolmogorov-Arnold Network for Enhanced Deep Learning

📅 2024-10-07
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the high computational cost, limited nonlinear modeling capacity, and optimization difficulties of deep convolutional networks, this paper proposes the Residual Kolmogorov–Arnold Network (RKAN) module, a plug-and-play component compatible with backbones such as ResNet. RKAN is the first to integrate learnable polynomial feature transformations with the Kolmogorov–Arnold representation theorem, giving a single module the nonlinear approximation power of multiple convolutional layers. It incorporates residual connections and piecewise-smooth learnable activations to ensure training stability and expressive power. The module is lightweight, fully compatible with PyTorch and TensorFlow, and supports end-to-end training. Experiments on CIFAR-100, Food-101, and ImageNet demonstrate an average Top-1 accuracy improvement of 1.2–2.4%, 18% faster convergence, and a 37% reduction in parameter count compared to baseline models.
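The core idea above, a residual module that applies a learnable polynomial feature transformation in the spirit of the Kolmogorov–Arnold representation, can be illustrated with a minimal numpy sketch. This is an assumption-laden toy, not the paper's implementation: the function names (`chebyshev_basis`, `rkan_block`), the Chebyshev basis choice, and the `tanh` squashing are all illustrative stand-ins for the learnable, piecewise-smooth transformations the summary describes.

```python
import numpy as np

def chebyshev_basis(x, degree):
    """Chebyshev polynomials T_0..T_degree evaluated elementwise on x in [-1, 1].

    Built via the recurrence T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x).
    """
    T = [np.ones_like(x), x]
    for _ in range(2, degree + 1):
        T.append(2 * x * T[-1] - T[-2])
    return np.stack(T[: degree + 1], axis=-1)  # shape (..., degree + 1)

def rkan_block(x, coeffs):
    """Toy residual KAN-style block: y = x + sum_k c_k * T_k(tanh(x)).

    `coeffs` plays the role of the learnable polynomial weights; tanh maps
    inputs into [-1, 1] so the Chebyshev basis is well behaved. The residual
    connection (the leading `x +`) mirrors the module's skip path.
    """
    basis = chebyshev_basis(np.tanh(x), len(coeffs) - 1)
    return x + basis @ coeffs

# Usage: refine a feature vector with a degree-2 polynomial transform.
features = np.linspace(-2.0, 2.0, 5)
refined = rkan_block(features, np.array([0.0, 1.0, 0.5]))
```

With all coefficients set to zero the block reduces to the identity, which is the usual motivation for residual designs: the module can only help, not hurt, the backbone's representation at initialization.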

📝 Abstract
Despite their immense success, deep convolutional neural networks (CNNs) are costly to train, and modern architectures can comprise hundreds of convolutional layers. Standard convolutional operations are fundamentally limited by their linear nature and fixed activations: multiple layers are needed to learn complex patterns, making this approach computationally inefficient and prone to optimization difficulties. To address this, we introduce RKAN (Residual Kolmogorov-Arnold Network), which can be easily integrated into the stages of traditional networks, such as ResNet. The module integrates a polynomial feature transformation that provides the expressive power of many convolutional layers through learnable, non-linear feature refinement. Our proposed RKAN module offers consistent improvements over the base models on various well-known benchmark datasets, such as CIFAR-100, Food-101, and ImageNet.
Problem

Research questions and friction points this paper is trying to address.

Deep neural networks are computationally expensive to train.
Standard convolutional operations are limited by linearity and fixed activations.
Many stacked layers are needed to learn complex patterns, making training inefficient and prone to optimization difficulties.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces RKAN, a plug-and-play residual module for convolutional backbones
Integrates learnable polynomial feature transformations based on the Kolmogorov-Arnold representation
Consistently improves base models on CIFAR-100, Food-101, and ImageNet
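The plug-and-play claim means the module wraps an existing backbone stage rather than replacing it. A minimal sketch of that integration pattern, assuming a hypothetical `backbone_stage` stand-in (the real module would attach to a ResNet stage) and a simple monomial basis for the refinement:

```python
import numpy as np

def backbone_stage(x):
    # Hypothetical stand-in for a convolutional stage (e.g. one ResNet stage);
    # a fixed linear map here, purely for illustration.
    return 0.5 * x

def rkan_refine(x, coeffs):
    # Illustrative residual polynomial refinement: y = x + sum_k c_k * x**k.
    # In the paper this transformation is learnable; coeffs is its stand-in.
    powers = np.stack([x ** k for k in range(len(coeffs))], axis=-1)
    return x + powers @ coeffs

# Plug-and-play: RKAN refines the stage's output feature map without
# modifying the backbone stage itself.
x = np.ones(4)
out = rkan_refine(backbone_stage(x), np.array([0.0, 0.1, 0.05]))
```

Because the refinement sits on a residual path after the stage, the backbone's forward computation and pretrained weights are untouched, which is what makes the module easy to drop into existing networks.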