Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability

📅 2025-01-16
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Addressing three key challenges in Kolmogorov–Arnold Networks (KANs)—training instability, parameter redundancy, and opaque mechanistic behavior of B-spline activation functions—this work proposes a Free-Knot KAN architecture. First, we derive a theoretical upper bound on the number of B-spline knots. Second, we design an adaptive free-knot mechanism that reduces parameter count to the same order as MLPs—approximately one-tenth that of the original KAN. Third, we impose C² continuity constraints and introduce a range-expansion gradient training strategy to enhance activation smoothness and training robustness. Extensive evaluation across eight cross-domain benchmarks—including image, text, time-series, multimodal, and function approximation tasks—demonstrates consistent improvements: our method achieves superior function approximation accuracy and downstream task performance compared to existing KAN variants, while matching or exceeding the performance of MLPs of comparable size.
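The free-knot mechanism summarized above can be sketched in a few lines. The following is a minimal, hypothetical parameterization (not the authors' code): unconstrained parameters are mapped through softplus gaps to strictly increasing interior knot positions, and the resulting clamped B-spline basis is evaluated with the Cox-de Boor recursion. The function names `free_knots` and `spline_activation` are illustrative, as are the default range and degree.

```python
import math

def bspline_basis(i, k, t, x):
    """Cox-de Boor recursion: value at x of the i-th B-spline
    basis function of degree k over knot vector t."""
    if k == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = right = 0.0
    if t[i + k] != t[i]:
        left = (x - t[i]) / (t[i + k] - t[i]) * bspline_basis(i, k - 1, t, x)
    if t[i + k + 1] != t[i + 1]:
        right = ((t[i + k + 1] - x) / (t[i + k + 1] - t[i + 1])
                 * bspline_basis(i + 1, k - 1, t, x))
    return left + right

def free_knots(raw, lo=-1.0, hi=1.0):
    """Map len(raw) unconstrained parameters to len(raw) - 1 strictly
    increasing interior knots inside (lo, hi). Softplus gaps keep every
    increment positive, so the ordering constraint holds by construction."""
    gaps = [math.log1p(math.exp(r)) for r in raw]   # softplus, always > 0
    total = sum(gaps)
    knots, acc = [], 0.0
    for g in gaps[:-1]:
        acc += g
        knots.append(lo + (hi - lo) * acc / total)
    return knots

def spline_activation(x, coeffs, raw_knots, degree=3, lo=-1.0, hi=1.0):
    """One learnable activation: a linear combination of the B-spline
    basis over a clamped knot vector built from the free knots."""
    interior = free_knots(raw_knots, lo, hi)
    t = [lo] * (degree + 1) + interior + [hi] * (degree + 1)
    n = len(t) - degree - 1          # number of basis functions
    assert len(coeffs) == n
    return sum(c * bspline_basis(i, degree, t, x)
               for i, c in enumerate(coeffs))
```

Because the knots move during training, the basis adapts its resolution to the data, which is the intuition behind the reduced knot budget; a full implementation would also add the paper's $C^2$ constraint and range-expansion schedule on top of this parameterization.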

📝 Abstract
Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementation often suffers from poor training stability and a heavy trainable-parameter burden. Furthermore, the behavior of the learned activation functions derived from B-splines remains poorly understood. In this work, we analyze the behavior of KANs through the lens of spline knots and derive lower and upper bounds for the number of knots in B-spline-based KANs. To address these limitations, we propose a novel Free-Knots KAN that improves on the original KAN while reducing the number of trainable parameters to the scale of standard Multi-Layer Perceptrons (MLPs). Additionally, we introduce a new training strategy that ensures $C^2$ continuity of the learnable splines, yielding smoother activations than the original KAN, and improves training stability through range expansion. The proposed method is comprehensively evaluated on 8 datasets spanning image, text, time-series, multimodal, and function-approximation tasks. The promising results demonstrate the feasibility of KAN-based networks and the effectiveness of the proposed method.
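The parameter-scale claim can be illustrated with back-of-envelope arithmetic. The formulas below are assumptions, not the paper's exact accounting: an MLP edge carries one weight, while an original-KAN edge carries roughly grid + degree spline coefficients plus a base-weight term (the default grid size of 5 and cubic degree follow common KAN implementations).

```python
def mlp_params(d_in, d_out):
    """One dense layer: weight matrix plus bias vector."""
    return d_in * d_out + d_out

def kan_params(d_in, d_out, grid=5, degree=3):
    """One original-KAN layer, rough accounting: each of the
    d_in * d_out edges holds (grid + degree) B-spline coefficients
    plus one base-weight term."""
    return d_in * d_out * (grid + degree + 1)

# For a 128 -> 128 layer, the original KAN carries roughly an order
# of magnitude more parameters than the equivalent MLP layer.
ratio = kan_params(128, 128) / mlp_params(128, 128)
print(round(ratio, 1))
```

This factor of roughly nine to ten is consistent with the summary's claim that shrinking the per-edge knot budget brings the free-knot variant down to about one-tenth of the original KAN's parameter count, i.e. the same order as an MLP.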
Problem

Research questions and friction points this paper is trying to address.

Kolmogorov-Arnold Neural Networks
Stability Issues
Parameter Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Kolmogorov-Arnold Neural Network
Smooth Learnable Spline
Knot Optimization