Multilevel Training for Kolmogorov-Arnold Networks

📅 2026-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the inefficiency of conventional neural network training in complex tasks such as physics-informed neural networks (PINNs), where the lack of exploitable structure hinders optimization. For the first time, a multigrid algorithm is introduced into the training of Kolmogorov–Arnold Networks (KANs). By establishing an equivalence between KANs and multi-channel MLPs through a basis transformation, the authors design a multilevel optimization strategy that progressively refines spline knots across levels. Leveraging the compact support of spline basis functions and analytic geometric interpolation operators, the method enables lossless transfer and complementary optimization from coarse to fine models. Experiments demonstrate that this approach achieves several orders of magnitude improvement in accuracy over standard KAN or MLP training, significantly accelerating convergence and enhancing generalization across multiple tasks.
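
The KAN-to-MLP equivalence rests on a classical change of basis: on an interval with simple interior knots, the degree-p spline space is spanned by low-degree polynomials plus truncated powers, and truncated powers are exactly power-ReLU units. A hedged sketch of that identity follows (this is the standard truncated-power basis; the paper's precise formulation may differ in details):

```latex
% Truncated-power basis of the degree-p spline space with simple interior
% knots \xi_1 < \dots < \xi_m (standard identity; the paper's exact change
% of basis may differ in details).
\[
  S_p(\xi_1, \dots, \xi_m)
    = \operatorname{span}\{1, x, \dots, x^p\}
      \oplus \operatorname{span}\{(x - \xi_j)_+^p\}_{j=1}^{m},
  \qquad (u)_+^p := \mathrm{ReLU}(u)^p .
\]
% Hence every learned KAN activation written in the B-spline basis,
%   \phi(x) = \sum_i c_i B_{i,p}(x),
% rewrites as
%   \phi(x) = q(x) + \sum_j w_j \, \mathrm{ReLU}(x - \xi_j)^p
% for a degree-p polynomial q: a multichannel power-ReLU unit, with the
% linear map c \mapsto (q, w) giving the change of basis.
```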

📝 Abstract
Algorithmic speedup of training common neural architectures is made difficult by the lack of structure guaranteed by the function compositions inherent to such networks. In contrast to multilayer perceptrons (MLPs), Kolmogorov-Arnold networks (KANs) provide more structure by expanding learned activations in a specified basis. This paper exploits this structure to develop practical algorithms and theoretical insights, yielding training speedup via multilevel training for KANs. To do so, we first establish an equivalence between KANs with spline basis functions and multichannel MLPs with power ReLU activations through a linear change of basis. We then analyze how this change of basis affects the geometry of gradient-based optimization with respect to spline knots. The KAN change of basis motivates a multilevel training approach, where we train a sequence of KANs naturally defined through a uniform refinement of spline knots, with analytic geometric interpolation operators between models. The interpolation scheme enables a "properly nested hierarchy" of architectures, ensuring that interpolation to a fine model preserves the progress made on coarse models, while the compact support of spline basis functions ensures complementary optimization on subsequent levels. Numerical experiments demonstrate that our multilevel training approach can achieve orders of magnitude improvement in accuracy over conventional methods to train comparable KANs or MLPs, particularly for physics-informed neural networks. Finally, this work demonstrates how principled design of neural networks can lead to exploitable structure, and in this case, multilevel algorithms that can dramatically improve training performance.
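
The "properly nested hierarchy" claim is concrete for B-splines: uniform knot insertion enlarges the spline space without changing any function already in it, so prolongation from a coarse KAN to a fine one can be lossless. Below is a minimal sketch of that nesting, assuming SciPy ≥ 1.8 for `BSpline.design_matrix`; the least-squares construction of the prolongation operator `P` is an illustrative stand-in for the paper's analytic geometric interpolation operators.

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3  # cubic splines, as in typical KAN implementations
# Coarse open (clamped) knot vector on [0, 1] with three interior knots.
t_coarse = np.concatenate(([0.0] * k, np.linspace(0.0, 1.0, 5), [1.0] * k))
# Uniform refinement: insert the midpoint of every coarse knot interval.
mids = 0.5 * (t_coarse[k:-k - 1] + t_coarse[k + 1:-k])
t_fine = np.sort(np.concatenate([t_coarse, mids]))

# Sample both bases densely; columns are B-spline basis functions.
x = np.linspace(0.0, 1.0, 400)
B_c = BSpline.design_matrix(x, t_coarse, k).toarray()  # (400, n_coarse)
B_f = BSpline.design_matrix(x, t_fine, k).toarray()    # (400, n_fine)

# Prolongation P maps coarse coefficients to fine coefficients. The paper
# derives such operators analytically from knot insertion; here we recover
# P numerically by least squares, which is exact precisely because the
# coarse spline space is nested inside the fine one.
P, *_ = np.linalg.lstsq(B_f, B_c, rcond=None)

rng = np.random.default_rng(0)
c_coarse = rng.standard_normal(B_c.shape[1])  # one activation's coefficients
err = np.abs(B_f @ (P @ c_coarse) - B_c @ c_coarse).max()
print(f"max prolongation error: {err:.1e}")  # ~1e-15: transfer is lossless
```

Because the transfer is exact, optimization on the fine level starts from the coarse model's loss value, and the compact support of the new, finer basis functions means subsequent updates are largely complementary to the coarse-level progress.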
Problem

Research questions and friction points this paper is trying to address.

Kolmogorov-Arnold Networks
multilevel training
training speedup
neural network structure
spline basis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Kolmogorov-Arnold Networks
multilevel training
spline basis
change of basis
physics-informed neural networks
Ben S. Southworth
Theoretical Division, Los Alamos National Laboratory, USA
Jonas A. Actor
Center for Computing Research, Sandia National Laboratories, USA
Graham Harper
Eric C. Cyr
Computational Mathematics Department, Sandia National Laboratories
Computational Science
Preconditioning
Numerical PDEs
Scientific Machine Learning