🤖 AI Summary
This work addresses the loss of interpretability in Kolmogorov–Arnold Networks (KANs) caused by high-curvature oscillations in their learnable univariate activation functions, a phenomenon that standard regularization penalties do not prevent. The authors derive a basis-agnostic curvature penalty that explicitly encourages smooth activation functions, and show that penalized models can maintain accuracy while achieving substantially smoother activations. Accounting for how function composition shapes curvature, they also prove an upper bound on the full model's curvature relative to the penalty and use it to motivate richer forms of penalties. The results improve interpretability without sacrificing accuracy, strengthening KANs as a practical tool for both prediction and insight in scientific machine learning.
📝 Abstract
Kolmogorov-Arnold networks (KANs) offer a potent combination of accuracy and interpretability, thanks to their compositions of learnable univariate activation functions. However, the activations of well-fitting KANs tend to exhibit pathologically high-curvature oscillations, making them difficult to interpret, and standard regularization penalties do not prevent this. Here we derive a basis-agnostic curvature penalty and show that penalized models can maintain accuracy while achieving substantially smoother activations. Accounting for how function composition shapes curvature, we prove an upper bound on the full model's curvature relative to the curvature penalty, and use this to motivate richer forms of penalties. Scientific machine learning is increasingly bottlenecked by the trade-off between accuracy and interpretability. Results such as ours that improve interpretability without sacrificing accuracy will further strengthen KANs as a practical tool for both prediction and insight.
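To make the idea of a basis-agnostic curvature penalty concrete, here is a minimal sketch. It rests on an assumption, since the abstract does not state the paper's exact formula: curvature is measured as the integral of the squared second derivative of each univariate activation, estimated by finite differences. Because it uses only function evaluations on a grid, it does not depend on the basis (splines or otherwise) that parameterizes the activation. The names `curvature_penalty` and `n_points` are hypothetical.

```python
import numpy as np

def curvature_penalty(phi, lo, hi, n_points=2001):
    """Approximate the integral of phi''(x)**2 over [lo, hi].

    phi is any vectorized univariate function; only its values on a
    uniform grid are used, so the estimate is independent of how phi
    is parameterized. (Illustrative assumption, not the paper's formula.)
    """
    x = np.linspace(lo, hi, n_points)
    h = x[1] - x[0]
    y = phi(x)
    # Central second difference at the interior grid points.
    d2 = (y[:-2] - 2.0 * y[1:-1] + y[2:]) / h**2
    # Riemann sum of the squared second derivative.
    return float(np.sum(d2**2) * h)

# Two activations with the same amplitude but different oscillation rates.
smooth = curvature_penalty(np.sin, 0.0, 2.0 * np.pi)
wiggly = curvature_penalty(lambda x: np.sin(5.0 * x), 0.0, 2.0 * np.pi)
print(f"smooth: {smooth:.3f}, wiggly: {wiggly:.3f}")
```

For sin(kx) on a full period the exact value is k⁴π, so the oscillating activation is penalized several hundred times more heavily than the smooth one even though both have the same amplitude. In training, a term like this would presumably be summed over all activations and added to the data-fit loss with a weight chosen by the user.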