Leveraging KANs for Expedient Training of Multichannel MLPs via Preconditioning and Geometric Refinement

📅 2025-05-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Multi-channel MLPs suffer from low training efficiency. Method: This work establishes, for the first time, a rigorous geometric and algebraic equivalence between free-knot B-spline Kolmogorov–Arnold networks (KANs) and a specific class of multi-channel MLPs. Leveraging this correspondence, we propose a hierarchical geometric refinement strategy along the channel dimension and design an end-to-end preconditioned training framework that jointly optimizes spline knot positions and network weights. The method synergistically integrates KAN’s localized basis function property with multi-channel MLPs’ parallel representational capacity. Contribution/Results: On regression and scientific machine learning benchmarks, our approach achieves up to 3.2× iteration speedup and reduces average test error by 18.7%, significantly improving both training efficiency and generalization accuracy.

Technology Category

Application Category

📝 Abstract
Multilayer perceptrons (MLPs) are a workhorse machine learning architecture, used in a variety of modern deep learning frameworks. However, recently Kolmogorov-Arnold Networks (KANs) have become increasingly popular due to their success on a range of problems, particularly for scientific machine learning tasks. In this paper, we exploit the relationship between KANs and multichannel MLPs to gain structural insight into how to train MLPs faster. We demonstrate the KAN basis (1) provides geometric localized support, and (2) acts as a preconditioned descent in the ReLU basis, overall resulting in expedited training and improved accuracy. Our results show the equivalence between free-knot spline KAN architectures, and a class of MLPs that are refined geometrically along the channel dimension of each weight tensor. We exploit this structural equivalence to define a hierarchical refinement scheme that dramatically accelerates training of the multi-channel MLP architecture. We show further accuracy improvements can be had by allowing the $1$D locations of the spline knots to be trained simultaneously with the weights. These advances are demonstrated on a range of benchmark examples for regression and scientific machine learning.
Problem

Research questions and friction points this paper is trying to address.

Accelerate training of multichannel MLPs using KAN insights
Improve MLP accuracy via geometric refinement and preconditioning
Train spline knot locations with weights for better performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes KANs for faster MLP training
Employs geometric refinement along channels
Trains spline knots with weights
🔎 Similar Papers
No similar papers found.
J
Jonas A. Actor
Center for Computing Research, Sandia National Laboratories, Albuquerque, NM 87123
G
Graham Harper
Center for Computing Research, Sandia National Laboratories, Albuquerque, NM 87123
Ben Southworth
Ben Southworth
Scientist II, Los Alamos National Laboratory
Multigrid methodsnumerical linear algebrascientific computing
Eric C. Cyr
Eric C. Cyr
Computational Mathematics Department, Sandia National Laboratories
Computational SciencePreconditioningNumerical PDEsScientific Machine Learning