🤖 AI Summary
To address parameter redundancy and limited representational efficiency in convolutional neural networks (CNNs), this paper proposes Convolutional Kolmogorov–Arnold Networks (ConvKANs), the first framework to embed per-parameter learnable spline activations directly into convolutional kernels—explicitly enhancing intra-kernel nonlinear modeling capacity. Grounded in the Kolmogorov–Arnold representation theorem, we design parameterized convolutional kernels and an end-to-end differentiable training framework. Evaluated on Fashion-MNIST, ConvKANs achieve accuracy comparable to standard CNNs while, in some cases, using roughly half the parameters, suggesting higher per-kernel representational efficiency. Our key contributions are: (i) the pioneering integration of the Kolmogorov–Arnold Network (KAN) paradigm into convolutional architectures; and (ii) a novel, theoretically grounded paradigm for lightweight visual models that redefines kernel-level nonlinearity and parameter efficiency.
📝 Abstract
In this paper, we introduce Convolutional Kolmogorov–Arnold Networks (Convolutional KANs), an innovative alternative to the standard Convolutional Neural Networks (CNNs) that have revolutionized the field of computer vision. By integrating the learnable non-linear activation functions presented in Kolmogorov–Arnold Networks (KANs) into convolutions, we propose a new layer. Throughout the paper, we empirically validate the performance of Convolutional KANs against traditional architectures on the Fashion-MNIST dataset, finding that, in some cases, this new approach maintains a similar level of accuracy while using half the number of parameters. These experiments suggest that KAN convolutions learn more per kernel, which opens up a new horizon of possibilities in deep learning for computer vision.
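To make the core idea concrete, the sketch below shows a forward pass of a "KAN convolution" on a single-channel image: instead of a scalar weight at each kernel position, the kernel holds a learnable univariate function applied to the corresponding input value, and the outputs are summed. This is a minimal NumPy illustration, not the authors' implementation; for brevity it uses piecewise-linear "hat" basis functions in place of the B-splines used in KANs, and the names `hat_basis` and `kan_conv2d` are hypothetical.

```python
import numpy as np

def hat_basis(x, grid):
    """Piecewise-linear 'hat' basis functions evaluated at x.

    grid holds equally spaced knots; returns shape x.shape + (len(grid),).
    (A stand-in for the B-spline bases used in actual KANs.)
    """
    x = np.asarray(x)[..., None]                 # broadcast against knots
    width = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x - grid) / width, 0.0, None)

def kan_conv2d(image, coeffs, grid):
    """Single-channel KAN-style convolution (valid padding, stride 1).

    Each kernel position (i, j) carries a learnable univariate function
        phi_ij(x) = sum_k coeffs[i, j, k] * B_k(x),
    and the output at (p, q) is sum_ij phi_ij(patch[i, j]).
    coeffs has shape (kh, kw, n_basis); grid holds the n_basis knots.
    """
    kh, kw, _ = coeffs.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for p in range(out.shape[0]):
        for q in range(out.shape[1]):
            patch = image[p:p + kh, q:q + kw]    # (kh, kw)
            basis = hat_basis(patch, grid)       # (kh, kw, n_basis)
            out[p, q] = np.sum(basis * coeffs)   # sum of phi_ij over the patch
    return out

# Toy usage: a 3x3 kernel of learnable functions over 5 knots on [-1, 1]
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 5)
coeffs = rng.normal(size=(3, 3, 5))              # these would be trained
img = rng.uniform(-1.0, 1.0, size=(8, 8))
feat = kan_conv2d(img, coeffs, grid)
print(feat.shape)  # (6, 6)
```

Because the output is linear in `coeffs`, the layer remains end-to-end differentiable, which is what allows the per-position activation functions to be trained by ordinary backpropagation.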