AI Summary
Addressing three key challenges in Kolmogorov-Arnold Networks (KANs): training instability, parameter redundancy, and the opaque mechanistic behavior of B-spline activation functions, this work proposes a Free-Knot KAN architecture. First, we derive a theoretical upper bound on the number of B-spline knots. Second, we design an adaptive free-knot mechanism that reduces the parameter count to the same order as MLPs, approximately one-tenth that of the original KAN. Third, we impose C² continuity constraints and introduce a range-expansion gradient training strategy to enhance activation smoothness and training robustness. Extensive evaluation across eight cross-domain benchmarks (image, text, time-series, multimodal, and function-approximation tasks) demonstrates consistent improvements: our method achieves superior function-approximation accuracy and downstream task performance compared with existing KAN variants, while matching or exceeding MLPs of comparable size.
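The "same order as MLPs, approximately one-tenth of the original KAN" claim follows from a back-of-envelope parameter count. The sketch below is illustrative only: the layer sizes, grid size `G`, and spline order `k` are assumptions, and the exact per-edge coefficient count varies by KAN implementation.

```python
# Hedged back-of-envelope parameter count for one layer (illustrative
# sizes, not taken from the paper): an MLP layer has one weight per
# edge, while a B-spline KAN layer carries roughly G + k spline
# coefficients per edge plus a couple of per-edge scalars.
d_in, d_out = 64, 64   # assumed layer width
G, k = 5, 3            # assumed grid size and spline order

mlp_params = d_in * d_out                # one scalar weight per edge
per_edge = G + k + 2                     # ~spline coefs + base weight + scale (implementation-dependent)
kan_params = d_in * d_out * per_edge     # original-KAN-style layer
ratio = kan_params / mlp_params          # ~10x for these settings

print(mlp_params, kan_params, ratio)
```

With these assumed settings the ratio is 10, which is consistent with the roughly tenfold reduction the free-knot variant targets when it brings the count back to the MLP scale.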
Abstract
Kolmogorov-Arnold Networks (KANs) have gained significant attention in the machine learning community. However, their implementations often suffer from poor training stability and a heavy trainable-parameter count. Furthermore, the behavior of the learned activation functions derived from B-splines is poorly understood. In this work, we analyze the behavior of KANs through the lens of spline knots and derive lower and upper bounds on the number of knots in B-spline-based KANs. To address these limitations, we propose a novel Free-Knot KAN that improves on the original KAN while reducing the number of trainable parameters to the scale of standard Multi-Layer Perceptrons (MLPs). Additionally, we introduce a new training strategy that enforces $C^2$ continuity of the learnable splines, yielding smoother activations than the original KAN, and improves training stability through range expansion. The proposed method is comprehensively evaluated on 8 datasets spanning image, text, time-series, multimodal, and function-approximation tasks. The promising results demonstrate the feasibility of KAN-based networks and the effectiveness of the proposed method.
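To make the spline-knot viewpoint concrete, here is a minimal sketch of a single KAN edge activation as a cubic B-spline with free interior knots. All names, sizes, and knot positions are illustrative assumptions, not the paper's implementation; the relevant fact is that a cubic (degree-3) B-spline with simple knots is automatically $C^2$-continuous, which is the smoothness class the proposed training strategy enforces.

```python
import numpy as np
from scipy.interpolate import BSpline

degree = 3  # cubic: C^2-continuous at simple knots

# "Free" interior knots: in a Free-Knot KAN these would be trainable
# (kept sorted inside the domain); here they are fixed for illustration.
interior = np.sort(np.array([-0.5, 0.1, 0.6]))

# Clamp the knot vector at the domain boundaries [-1, 1] by repeating
# each endpoint degree+1 times (standard clamped B-spline construction).
knots = np.concatenate(([-1.0] * (degree + 1), interior, [1.0] * (degree + 1)))

n_coef = len(knots) - degree - 1         # number of spline coefficients
rng = np.random.default_rng(0)
coef = rng.normal(size=n_coef)           # per-edge trainable coefficients

phi = BSpline(knots, coef, degree)       # the learnable activation phi(x)
x = np.linspace(-1.0, 1.0, 5)
y = phi(x)                               # evaluate the edge activation
```

Moving the interior knots instead of densifying a fixed grid is what keeps the per-edge coefficient count small; the clamped construction above keeps the spline well-defined on the whole input range.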