🤖 AI Summary
MultKAN layers suffer from limited applicability as output layers, parameter redundancy, and complex hyperparameter tuning, restricting their use in data-driven modeling. To address these issues, we propose LeanKAN, a minimalist, drop-in KAN layer paradigm. Its core innovation lies in reconstructing learnable activation functions grounded in the Kolmogorov–Arnold representation theorem, integrating additive and multiplicative sub-nodes within a compact topology while eliminating auxiliary nonlinear activations and redundant weights. This design enables direct use as an output layer, substantial parameter reduction, and a smaller hyperparameter set. Experiments on synthetic benchmarks and KAN-ODE dynamical system modeling demonstrate that LeanKAN achieves faster convergence and better generalization with significantly fewer parameters and lower memory overhead, outperforming comparable and even much larger MultKANs across a range of tasks.
📝 Abstract
The recently proposed Kolmogorov–Arnold network (KAN) is a promising alternative to multi-layer perceptrons (MLPs) for data-driven modeling. While original KAN layers were only capable of representing the addition operator, the recently proposed MultKAN layer combines addition and multiplication subnodes in an effort to improve representation performance. Here, we find that MultKAN layers suffer from three key drawbacks: limited applicability in output layers, bulky parameterizations with extraneous activations, and the inclusion of complex hyperparameters. To address these issues, we propose LeanKANs, a direct and modular replacement for MultKAN and traditional AddKAN layers. LeanKANs address these three drawbacks of MultKAN through general applicability as output layers, significantly reduced parameter counts for a given network structure, and a smaller set of hyperparameters. As a one-to-one layer replacement for standard AddKAN and MultKAN layers, LeanKAN provides these benefits to traditional KAN learning problems as well as to augmented KAN structures in which it serves as the backbone, such as KAN Ordinary Differential Equations (KAN-ODEs) or Deep Operator KANs (DeepOKANs). We demonstrate LeanKAN's simplicity and efficiency across both a standard KAN toy problem and a KAN-ODE dynamical system modeling problem, where its sparser parameterization and compact structure serve to increase its expressivity and learning capability, leading it to outperform similar and even much larger MultKANs in various tasks.
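To make the addition/multiplication subnode idea concrete, the sketch below shows a toy KAN-style layer whose output nodes either sum or multiply their learnable edge activations. This is an illustrative assumption, not the paper's LeanKAN implementation: it uses a small polynomial basis in place of the spline bases typical of KAN codebases, and the names (`lean_kan_layer`, `n_add`) and the additive/multiplicative node split are hypothetical.

```python
import numpy as np

def edge_activation(x, coeffs):
    # Learnable univariate function on one edge, here a simple
    # polynomial basis expansion (stand-in for KAN spline bases).
    basis = np.stack([x**k for k in range(len(coeffs))], axis=-1)
    return basis @ coeffs

def lean_kan_layer(x, coeffs, n_add):
    # x: input vector of shape (n_in,)
    # coeffs: per-edge coefficients of shape (n_out, n_in, degree + 1)
    # The first n_add output nodes sum their edge activations
    # (additive sub-nodes); the remaining nodes multiply them
    # (multiplicative sub-nodes).
    phi = np.array([[edge_activation(x[j], coeffs[i, j])
                     for j in range(x.shape[0])]
                    for i in range(coeffs.shape[0])])
    add_part = phi[:n_add].sum(axis=1)
    mult_part = phi[n_add:].prod(axis=1)
    return np.concatenate([add_part, mult_part])

# With identity edge activations (coeffs picking out x itself),
# an additive node returns x1 + x2 and a multiplicative node x1 * x2.
coeffs = np.zeros((2, 2, 2))
coeffs[:, :, 1] = 1.0  # phi(x) = x on every edge
print(lean_kan_layer(np.array([1.0, 2.0]), coeffs, n_add=1))  # [3. 2.]
```

In this toy, the only structural hyperparameter beyond the layer shape is `n_add`, the number of additive sub-nodes; the abstract's claim of a smaller hyperparameter set refers to LeanKAN needing fewer such knobs than MultKAN.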