🤖 AI Summary
To address the high computational overhead and limited hardware compatibility of Kolmogorov–Arnold Networks (KANs), this paper proposes SineKAN, an architecture that replaces the B-spline basis functions of the original KAN with learnable, re-weighted sinusoidal activation functions on the edges of the computational graph. This design retains the expressivity motivated by the Kolmogorov–Arnold representation theorem while improving periodic representation capability, gradient stability, and hardware efficiency. On a benchmark vision task, SineKAN matches or exceeds the accuracy of both B-spline and Fourier-based KANs, trains substantially faster across all tested hidden-layer sizes, batch sizes, and depths, and shows numerical accuracy that could scale comparably to dense neural networks. The result is a lightweight, interpretable, and hardware-aware alternative within the KAN family.
📝 Abstract
Recent work has established an alternative to traditional multi-layer perceptron neural networks in the form of Kolmogorov-Arnold Networks (KAN). The general KAN framework uses learnable activation functions on the edges of the computational graph followed by summation on nodes. The learnable edge activation functions in the original implementation are basis spline functions (B-Spline). Here, we present a model in which learnable grids of B-Spline activation functions are replaced by grids of re-weighted sine functions (SineKAN). We evaluate the numerical performance of our model on a benchmark vision task. We show that our model can perform better than or comparably to B-Spline KAN models and an alternative KAN implementation based on periodic cosine and sine functions representing a Fourier series. Further, we show that SineKAN has numerical accuracy that could scale comparably to dense neural networks (DNNs). Compared to the two baseline KAN models, SineKAN achieves a substantial speed increase at all hidden layer sizes, batch sizes, and depths. The current advantage of DNNs due to hardware and software optimizations is discussed along with theoretical scaling. Properties of SineKAN relative to other KAN implementations, as well as current limitations, are also discussed.
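To make the architecture described above concrete, here is a minimal sketch of a KAN-style layer with sinusoidal edge activations: each edge applies a learnable weighted grid of sine functions to its scalar input, and each output node sums over its incoming edges. The exact parameterization in the paper (e.g., whether frequencies and phases are shared across edges or learned per edge) is not given here, so this sketch assumes shared integer frequencies and a per-edge amplitude grid for illustration only.

```python
import numpy as np

class SineKANLayer:
    """Illustrative KAN-style layer with sine-grid edge activations.

    Each edge (i -> o) applies an assumed learnable function
        phi_io(x) = sum_k a[i, o, k] * sin(f[k] * x + p[k]),
    and node o sums phi_io over all inputs i. This is a sketch of the
    general idea, not the paper's exact parameterization.
    """

    def __init__(self, in_dim, out_dim, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        # Learnable amplitudes: one re-weighting per (input, output, grid) triple.
        self.amplitudes = rng.normal(0.0, 0.1, (in_dim, out_dim, grid_size))
        # Grid of frequencies/phases shared across edges (an assumption here).
        self.freqs = np.arange(1, grid_size + 1, dtype=float)
        self.phases = np.zeros(grid_size)

    def __call__(self, x):
        # x: (batch, in_dim) -> sine features: (batch, in_dim, grid_size)
        s = np.sin(x[:, :, None] * self.freqs + self.phases)
        # Weight each sine term, then sum over grid and input dimensions.
        return np.einsum("big,iog->bo", s, self.amplitudes)

layer = SineKANLayer(in_dim=4, out_dim=3)
out = layer(np.ones((2, 4)))
print(out.shape)  # (2, 3)
```

A full SineKAN model would stack such layers, with summation at the nodes playing the role that a fixed nonlinearity plays in a standard MLP; because the edge functions are smooth sines rather than piecewise splines, the forward pass reduces to dense trigonometric evaluations and contractions, which is where the speed advantage over B-spline grids comes from.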