Convolutional Kolmogorov-Arnold Networks

📅 2024-06-19
🏛️ arXiv.org
📈 Citations: 57
Influential: 4
🤖 AI Summary
To address parameter redundancy and limited representational efficiency in convolutional neural networks (CNNs), this paper proposes Convolutional Kolmogorov–Arnold Networks (Convolutional KANs), presented as the first framework to embed per-parameter learnable spline activations directly into convolutional kernels, enhancing each kernel's nonlinear modeling capacity. Grounded in the Kolmogorov–Arnold representation theorem, the authors design spline-parameterized convolutional kernels and an end-to-end differentiable training framework. Evaluated on Fashion-MNIST, Convolutional KANs achieve accuracy comparable to standard CNNs in some cases while using roughly half the parameters, suggesting greater per-kernel representational efficiency. Key contributions: (i) integrating the Kolmogorov–Arnold Network (KAN) paradigm into convolutional architectures; and (ii) a theoretically grounded approach to lightweight visual models that rethinks kernel-level nonlinearity and parameter efficiency.

📝 Abstract
In this paper, we introduce Convolutional Kolmogorov-Arnold Networks (Convolutional KANs), an innovative alternative to the standard Convolutional Neural Networks (CNNs) that have revolutionized the field of computer vision. By integrating the learnable non-linear activation functions presented in Kolmogorov-Arnold Networks (KANs) into convolutions, we propose a new layer. Throughout the paper, we empirically validate the performance of Convolutional KANs against traditional architectures on the Fashion-MNIST dataset, finding that, in some cases, this new approach maintains a similar level of accuracy while using half the number of parameters. These experiments show that KAN Convolutions seem to learn more per kernel, which opens up a new horizon of possibilities in deep learning for computer vision.
Problem

Research questions and friction points this paper is trying to address.

Replacing fixed-weight kernels with learnable non-linear functions
Improving parameter efficiency in convolutional neural networks
Enhancing expressive power with fewer resources
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates learnable spline-based activation functions
Replaces fixed-weight kernels with learnable functions
Improves parameter efficiency and expressive power
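The core idea above, replacing each fixed kernel weight w_ij with a learnable 1D function φ_ij applied to the input pixel, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the actual Convolutional KANs use B-spline parameterizations inherited from KANs, whereas this sketch substitutes a simpler piecewise-linear spline; all function and variable names here are hypothetical.

```python
def spline(x, grid, coefs):
    """Piecewise-linear learnable activation: linearly interpolate the
    learnable values `coefs` over the fixed knot positions `grid`.
    (The paper's KAN convolutions use B-splines; this is a simplification.)"""
    if x <= grid[0]:
        return coefs[0]
    if x >= grid[-1]:
        return coefs[-1]
    for i in range(len(grid) - 1):
        if grid[i] <= x <= grid[i + 1]:
            t = (x - grid[i]) / (grid[i + 1] - grid[i])
            return (1 - t) * coefs[i] + t * coefs[i + 1]

def kan_conv2d(image, kernel_coefs, grid):
    """Valid-mode 2D 'KAN convolution': instead of multiplying each input
    pixel by a scalar weight, each kernel slot (i, j) applies its own
    learnable spline to the pixel, and the outputs are summed."""
    k = len(kernel_coefs)                     # square kernel, k x k
    H, W = len(image), len(image[0])
    out = []
    for r in range(H - k + 1):
        row = []
        for c in range(W - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += spline(image[r + i][c + j], grid,
                                  kernel_coefs[i][j])
            row.append(acc)
        out.append(row)
    return out
```

A standard convolution is the special case where every φ_ij is linear; the parameter saving reported in the paper comes from each slot's spline doing more modeling work, so fewer kernels are needed overall. In practice the spline coefficients would be trained by gradient descent end to end, as the summary describes.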
Alexander Dylan Bodner
Universidad de San Andrés, Buenos Aires, Argentina
Antonio Santiago Tepsich
Universidad de San Andrés, Buenos Aires, Argentina
Jack Natan Spolski
Universidad de San Andrés, Buenos Aires, Argentina
Santiago Pourteau
Universidad de San Andrés, Buenos Aires, Argentina