🤖 AI Summary
MultKAN layers suffer from limited applicability as output layers, parameter redundancy, and complex hyperparameter tuning, restricting their use in data-driven modeling. To address these issues, we propose LeanKAN, a minimalist, drop-in KAN layer paradigm. Its core innovation lies in reconstructing learnable activation functions grounded in the Kolmogorov–Arnold representation theorem, integrating additive and multiplicative sub-nodes within a compact topology while eliminating auxiliary nonlinear activations and redundant weights. This design enables direct use as an output layer, substantial parameter reduction, and a smaller hyperparameter set. Experiments on synthetic benchmarks and KAN-ODE dynamical system modeling demonstrate that LeanKAN achieves faster convergence and better generalization with significantly fewer parameters and lower memory overhead, outperforming comparable and even much larger MultKANs across a range of tasks.
📝 Abstract
The recently proposed Kolmogorov–Arnold network (KAN) is a promising alternative to multi-layer perceptrons (MLPs) for data-driven modeling. While original KAN layers were only capable of representing the addition operator, the recently proposed MultKAN layer combines addition and multiplication subnodes in an effort to improve representation performance. Here, we find that MultKAN layers suffer from three key drawbacks: limited applicability in output layers, bulky parameterizations with extraneous activations, and the inclusion of complex hyperparameters. To address these issues, we propose LeanKANs, a direct and modular replacement for MultKAN and traditional AddKAN layers. LeanKANs address these three drawbacks of MultKAN through general applicability as output layers, significantly reduced parameter counts for a given network structure, and a smaller set of hyperparameters. As a one-to-one layer replacement for standard AddKAN and MultKAN layers, LeanKAN provides these benefits to traditional KAN learning problems as well as to augmented KAN structures in which it serves as the backbone, such as KAN Ordinary Differential Equations (KAN-ODEs) or Deep Operator KANs (DeepOKANs). We demonstrate LeanKAN's simplicity and efficiency across both a standard KAN toy problem and a KAN-ODE dynamical system modeling problem, where its sparser parameterization and compact structure serve to increase its expressivity and learning capability, leading it to outperform similar and even much larger MultKANs in various tasks.
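To make the addition/multiplication subnode idea concrete, the sketch below shows a toy KAN-style layer whose output nodes either sum or multiply their learnable edge activations. This is an illustrative assumption, not the paper's LeanKAN implementation: it uses a small polynomial basis in place of the spline bases typical of KAN codebases, and the names (`lean_kan_layer`, `n_add`) and the additive/multiplicative node split are hypothetical.

```python
import numpy as np

def edge_activation(x, coeffs):
    # Learnable univariate function on one edge, here a simple
    # polynomial basis expansion (stand-in for KAN spline bases).
    basis = np.stack([x**k for k in range(len(coeffs))], axis=-1)
    return basis @ coeffs

def lean_kan_layer(x, coeffs, n_add):
    # x: input vector of shape (n_in,)
    # coeffs: per-edge coefficients of shape (n_out, n_in, degree + 1)
    # The first n_add output nodes sum their edge activations
    # (additive sub-nodes); the remaining nodes multiply them
    # (multiplicative sub-nodes).
    phi = np.array([[edge_activation(x[j], coeffs[i, j])
                     for j in range(x.shape[0])]
                    for i in range(coeffs.shape[0])])
    add_part = phi[:n_add].sum(axis=1)
    mult_part = phi[n_add:].prod(axis=1)
    return np.concatenate([add_part, mult_part])

# With identity edge activations (coeffs picking out x itself),
# an additive node returns x1 + x2 and a multiplicative node x1 * x2.
coeffs = np.zeros((2, 2, 2))
coeffs[:, :, 1] = 1.0  # phi(x) = x on every edge
print(lean_kan_layer(np.array([1.0, 2.0]), coeffs, n_add=1))  # [3. 2.]
```

In this toy, the only structural hyperparameter beyond the layer shape is `n_add`, the number of additive sub-nodes; the abstract's claim of a smaller hyperparameter set refers to LeanKAN needing fewer such knobs than MultKAN.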