Convolutional Kolmogorov-Arnold Networks

📅 2024-06-19
🏛️ arXiv.org
📈 Citations: 57
Influential: 4
🤖 AI Summary
To address parameter redundancy and limited representational efficiency in convolutional neural networks (CNNs), this paper proposes Convolutional Kolmogorov–Arnold Networks (Convolutional KANs), presented as the first framework to embed per-parameter learnable spline activations directly into convolutional kernels, enhancing each kernel's nonlinear modeling capacity. Grounded in the Kolmogorov–Arnold representation theorem, the authors design spline-parameterized convolutional kernels and an end-to-end differentiable training framework. Evaluated on Fashion-MNIST, Convolutional KANs achieve accuracy comparable to standard CNNs in some cases while using roughly half the parameters, suggesting greater per-kernel representational efficiency. Key contributions: (i) integrating the Kolmogorov–Arnold Network (KAN) paradigm into convolutional architectures; and (ii) a theoretically grounded approach to lightweight visual models that rethinks kernel-level nonlinearity and parameter efficiency.

📝 Abstract
In this paper, we introduce Convolutional Kolmogorov-Arnold Networks (Convolutional KANs), an innovative alternative to the standard Convolutional Neural Networks (CNNs) that have revolutionized the field of computer vision. By integrating the learnable non-linear activation functions presented in Kolmogorov-Arnold Networks (KANs) into convolutions, we propose a new layer. Throughout the paper, we empirically validate the performance of Convolutional KANs against traditional architectures on the Fashion-MNIST dataset, finding that, in some cases, this new approach maintains a similar level of accuracy while using half the number of parameters. These experiments show that KAN Convolutions seem to learn more per kernel, which opens up a new horizon of possibilities in deep learning for computer vision.
Problem

Research questions and friction points this paper is trying to address.

Replacing fixed-weight kernels with learnable non-linear functions
Improving parameter efficiency in convolutional neural networks
Enhancing expressive power with fewer resources
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates learnable spline-based activation functions
Replaces fixed-weight kernels with learnable functions
Improves parameter efficiency and expressive power
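The core idea above, replacing each fixed kernel weight w_ij with a learnable 1D function φ_ij applied to the input pixel, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the actual Convolutional KANs use B-spline parameterizations inherited from KANs, whereas this sketch substitutes a simpler piecewise-linear spline; all function and variable names here are hypothetical.

```python
def spline(x, grid, coefs):
    """Piecewise-linear learnable activation: linearly interpolate the
    learnable values `coefs` over the fixed knot positions `grid`.
    (The paper's KAN convolutions use B-splines; this is a simplification.)"""
    if x <= grid[0]:
        return coefs[0]
    if x >= grid[-1]:
        return coefs[-1]
    for i in range(len(grid) - 1):
        if grid[i] <= x <= grid[i + 1]:
            t = (x - grid[i]) / (grid[i + 1] - grid[i])
            return (1 - t) * coefs[i] + t * coefs[i + 1]

def kan_conv2d(image, kernel_coefs, grid):
    """Valid-mode 2D 'KAN convolution': instead of multiplying each input
    pixel by a scalar weight, each kernel slot (i, j) applies its own
    learnable spline to the pixel, and the outputs are summed."""
    k = len(kernel_coefs)                     # square kernel, k x k
    H, W = len(image), len(image[0])
    out = []
    for r in range(H - k + 1):
        row = []
        for c in range(W - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += spline(image[r + i][c + j], grid,
                                  kernel_coefs[i][j])
            row.append(acc)
        out.append(row)
    return out
```

A standard convolution is the special case where every φ_ij is linear; the parameter saving reported in the paper comes from each slot's spline doing more modeling work, so fewer kernels are needed overall. In practice the spline coefficients would be trained by gradient descent end to end, as the summary describes.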
Alexander Dylan Bodner
Universidad de San Andrés, Buenos Aires, Argentina
Antonio Santiago Tepsich
Universidad de San Andrés, Buenos Aires, Argentina
Jack Natan Spolski
Universidad de San Andrés, Buenos Aires, Argentina
Santiago Pourteau
Universidad de San Andrés, Buenos Aires, Argentina