🤖 AI Summary
To address the excessive parameter count of Kolmogorov–Arnold Networks (KANs)—a key bottleneck hindering their practical deployment—this work proposes Parameter-Reduced KANs (PRKANs). Methodologically, PRKANs integrate Gaussian radial basis function (GRBF) activations, layer normalization, learnable piecewise spline parameterization, and a lightweight attention mechanism. Theoretically, the work argues that GRBF activations and layer normalization jointly improve both the approximation capability and the training stability of KANs. Empirically, PRKANs achieve MLP-level accuracy on MNIST and Fashion-MNIST while reducing parameter counts to the same order of magnitude as comparable MLPs—the first such demonstration for KANs. The work thus provides a systematic architectural compression framework and an empirical benchmark for efficient, practical KAN deployment. Code is publicly available.
📝 Abstract
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures, offering a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformers. By advancing network design, KANs are driving groundbreaking research and enabling transformative applications across various scientific domains involving neural networks. However, existing KANs often require significantly more parameters in their network layers compared to MLPs. To address this limitation, this paper introduces PRKANs (**P**arameter-**R**educed **K**olmogorov-**A**rnold **N**etworks), which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers. Experimental results on the MNIST and Fashion-MNIST datasets demonstrate that PRKANs with attention mechanisms outperform several existing KANs and rival the performance of MLPs, albeit with slightly longer training times. Furthermore, the study highlights the advantages of Gaussian Radial Basis Functions (GRBFs) and layer normalization in KAN designs. The repository for this work is available at: https://github.com/hoangthangta/All-KAN.
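To make the parameter-reduction idea concrete, below is a minimal NumPy sketch of a GRBF-based layer in the spirit described above. It is an illustrative assumption, not the authors' implementation (see the linked repository for that): inputs are expanded over a *shared* grid of Gaussian RBF centers, and a single weight matrix mixes the resulting features, so the learnable-parameter count is `d_in * n_centers * d_out` plus the grid hyperparameters, rather than a full per-edge spline parameterization. The names `grbf_basis` and `GRBFLayer` are hypothetical.

```python
import numpy as np

def grbf_basis(x, centers, sigma):
    """Gaussian RBF features exp(-(x - c)^2 / (2 sigma^2)).

    x: (batch, d_in); centers: (n_centers,) shared grid.
    Returns (batch, d_in, n_centers) via broadcasting.
    """
    diff = x[..., None] - centers
    return np.exp(-0.5 * (diff / sigma) ** 2)

class GRBFLayer:
    """Sketch of a parameter-reduced KAN-style layer.

    All inputs share one fixed RBF grid; a single weight matrix
    maps the flattened RBF features to the outputs.
    """
    def __init__(self, d_in, d_out, n_centers=8, sigma=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(-2.0, 2.0, n_centers)  # shared grid
        self.sigma = sigma
        # one weight per (input feature, center, output unit)
        self.W = rng.normal(0.0, 0.1, size=(d_in * n_centers, d_out))

    def __call__(self, x):
        phi = grbf_basis(x, self.centers, self.sigma)  # (batch, d_in, K)
        phi = phi.reshape(x.shape[0], -1)              # flatten features
        return phi @ self.W                            # (batch, d_out)
```

For example, `GRBFLayer(784, 10, n_centers=8)` uses 784 × 8 × 10 = 62,720 weights—the same order of magnitude as a 784→10 MLP layer scaled by the grid size, whereas per-edge spline parameterizations grow with an additional per-connection coefficient set.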