🤖 AI Summary
This work investigates the susceptibility of Kolmogorov–Arnold Networks (KANs) to catastrophic forgetting in continual learning and identifies their limitations on high-dimensional tasks. Method: By establishing a theoretical link between activation structure and forgetting dynamics, we show that overlap among spline basis supports and increased intrinsic data dimensionality fundamentally exacerbate forgetting in KANs. To address this, we propose KAN-LoRA, a lightweight, KAN-specific adapter for parameter-efficient fine-tuning. Contribution/Results: Extensive experiments on synthetic data, image classification, and language-model knowledge editing demonstrate that while KANs are robust on low-dimensional algorithmic tasks, they suffer significant forgetting in high-dimensional vision and language-modeling settings. KAN-LoRA achieves strong parameter efficiency (<0.5% additional parameters) and improved cross-task knowledge retention. This work provides the first systematic analytical framework and a practical solution for continual learning with KANs.
📝 Abstract
Catastrophic forgetting is a longstanding challenge in continual learning, where models lose knowledge from earlier tasks when learning new ones. While various mitigation strategies have been proposed for Multi-Layer Perceptrons (MLPs), recent architectures such as Kolmogorov–Arnold Networks (KANs) have been suggested to offer intrinsic resistance to forgetting through their localized spline-based activations. However, the practical behavior of KANs under continual learning remains unclear, and their limitations are not well understood. To address this, we present a comprehensive study of catastrophic forgetting in KANs and develop a theoretical framework that links forgetting to activation support overlap and intrinsic data dimensionality. We validate these analyses through systematic experiments on synthetic and vision tasks, measuring forgetting dynamics under varying model configurations and data complexity. Further, we introduce KAN-LoRA, a novel adapter design for parameter-efficient continual fine-tuning of language models, and evaluate its effectiveness on knowledge editing tasks. Our findings reveal that while KANs exhibit promising retention in low-dimensional algorithmic settings, they remain vulnerable to forgetting in high-dimensional domains such as image classification and language modeling. These results advance the understanding of KANs' strengths and limitations, offering practical insights for continual learning system design.
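The overlap argument above can be illustrated with a minimal sketch (not code from the paper). A KAN edge function is a learned sum of local basis functions; here we use a hypothetical one-dimensional edge with piecewise-linear "hat" bases on a uniform grid as a stand-in for B-splines. When two tasks occupy disjoint input regions, their basis supports do not overlap, so gradient updates for the second task never touch the coefficients the first task relies on. The grid, targets, and training loop are all invented for illustration.

```python
import numpy as np

def hat_basis(x, grid):
    """Piecewise-linear 'hat' basis functions with local support (width = one grid cell)."""
    h = grid[1] - grid[0]
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - grid[None, :]) / h)

grid = np.linspace(-1.0, 1.0, 21)   # 21 basis centers, spacing 0.1
coef = np.zeros(21)                  # learnable spline coefficients of one edge

def f(x):
    return hat_basis(x, grid) @ coef

def fit(x, y, steps=500, lr=0.5):
    """Plain gradient descent on squared error; only coefficients whose
    basis support touches x receive nonzero gradients."""
    global coef
    phi = hat_basis(x, grid)
    for _ in range(steps):
        coef -= lr * phi.T @ (phi @ coef - y) / len(x)

# Task A lives on the left half of the input range.
xa = np.linspace(-1.0, -0.2, 200)
fit(xa, np.sin(3 * xa))
pred_a = f(xa).copy()

# Task B lives on the right half: its basis supports are disjoint from task A's.
xb = np.linspace(0.2, 1.0, 200)
fit(xb, np.cos(3 * xb))

# With disjoint supports, task A's predictions are perfectly retained.
drift = np.max(np.abs(f(xa) - pred_a))
print(f"max drift on task A after task B: {drift:.2e}")
```

Shrinking the gap between the two input regions (or coarsening the grid so each basis is wider) makes the supports overlap and the drift nonzero, which is the low-dimensional retention mechanism the summary describes; in high-dimensional inputs such overlap is hard to avoid, consistent with the forgetting the paper reports for vision and language tasks.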