🤖 AI Summary
Kolmogorov–Arnold Networks (KANs) often lack symbolic interpretability in practice, as their learned activation functions rarely admit closed-form mathematical expressions.
Method: We propose Softly Symbolified Kolmogorov–Arnold Networks (S2KAN), a differentiable, sparse, end-to-end architecture that couples symbolic basis primitives with learnable soft gating. It introduces the first differentiable symbolic sparsification mechanism grounded in Minimum Description Length (MDL), enabling adaptive switching between symbol-dominated modeling and spline-based approximation. The framework integrates a symbolic dictionary, differentiable sparsity regularization, and the Kolmogorov–Arnold representation.
Contribution/Results: S2KAN achieves state-of-the-art accuracy on symbolic regression, dynamical-system forecasting, and real-world prediction tasks while substantially reducing model size. Notably, spontaneous symbolic emergence occurs even without explicit regularization, demonstrating an intrinsic capacity to induce interpretability.
📝 Abstract
Kolmogorov-Arnold Networks (KANs) offer a promising path toward interpretable machine learning: their learnable activations can be studied individually, while collectively fitting complex data accurately. In practice, however, trained activations often lack symbolic fidelity, learning pathological decompositions with no meaningful correspondence to interpretable forms. We propose Softly Symbolified Kolmogorov-Arnold Networks (S2KAN), which integrate symbolic primitives directly into training. Each activation draws from a dictionary of symbolic and dense terms, with learnable gates that sparsify the representation. Crucially, this sparsification is differentiable, enabling end-to-end optimization, and is guided by a principled Minimum Description Length objective. When symbolic terms suffice, S2KAN discovers interpretable forms; when they do not, it gracefully degrades to dense splines. We demonstrate competitive or superior accuracy with substantially smaller models across symbolic benchmarks, dynamical systems forecasting, and real-world prediction tasks, and observe evidence of emergent self-sparsification even without regularization pressure.
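The gated dictionary mechanism the abstract describes can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the primitive set, the sigmoid gating, and the `mdl_penalty` surrogate are all assumptions, and the dense spline term is omitted for brevity.

```python
import numpy as np

# Illustrative sketch (not the paper's code): one S2KAN-style activation
# mixing a small dictionary of symbolic primitives via learnable soft gates.
PRIMITIVES = [
    ("x",   lambda x: x),
    ("x^2", lambda x: x**2),
    ("sin", np.sin),
    ("exp", lambda x: np.exp(np.clip(x, -10.0, 10.0))),  # clipped for stability
]

def soft_gates(logits):
    """Sigmoid gates in (0, 1); sparsity pressure drives them near-binary."""
    return 1.0 / (1.0 + np.exp(-logits))

def activation(x, logits, weights):
    """phi(x) = sum_k gate_k * w_k * b_k(x) over the symbolic dictionary."""
    g = soft_gates(logits)
    return sum(gk * wk * f(x) for gk, wk, (_, f) in zip(g, weights, PRIMITIVES))

def mdl_penalty(logits, bits_per_term=1.0):
    """Differentiable description-length surrogate: expected count of open gates."""
    return bits_per_term * soft_gates(logits).sum()

# Gates opened only on the sin primitive recover phi(x) ~ 2*sin(x).
logits = np.array([-8.0, -8.0, 8.0, -8.0])   # effectively [0, 0, 1, 0]
weights = np.array([0.0, 0.0, 2.0, 0.0])
x = np.linspace(-1.0, 1.0, 5)
print(np.allclose(activation(x, logits, weights), 2 * np.sin(x), atol=1e-3))
```

Because the gates are smooth sigmoids rather than hard selections, both the gate logits and the term weights receive gradients, which is what makes the sparsification end-to-end trainable; the MDL-style penalty then charges each open gate a description-length cost.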