Posterior Contraction Rates for Sparse Kolmogorov-Arnold Networks in Anisotropic Besov Spaces

📅 2026-05-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work develops a Bayesian theoretical framework for the posterior contraction of sparse Bayesian Kolmogorov–Arnold networks (KANs) over anisotropic Besov spaces. To address the curse of dimensionality and adaptation to unknown smoothness in high-dimensional function estimation, it places spike-and-slab sparsity priors on the network parameters, with a hyperprior on a single model-size parameter, and controls model complexity through network width, spline-grid range and size, and parameter sparsity rather than depth. The analysis shows that, with depth held fixed, the posterior contracts at a near-minimax rate governed by the intrinsic anisotropic smoothness of the target function. The framework also extends to compositional Besov spaces, where the rates depend on layerwise smoothness and the effective dimension of the compositional structure, which mitigates the curse of dimensionality.
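For orientation, "intrinsic anisotropic smoothness" is usually summarized by a harmonic-type exponent. The summary does not state the paper's rate explicitly, so the display below is only the classical benchmark for anisotropic smoothness classes; the symbols $s_j$, $\tilde{s}$, $d$, and $\epsilon_n$ are notation assumed here, not taken from the source.

```latex
% Directional smoothness s_1, ..., s_d > 0 (assumed notation).
\[
  \tilde{s} \;=\; \Bigl( \sum_{j=1}^{d} \frac{1}{s_j} \Bigr)^{-1},
  \qquad
  \epsilon_n \;\asymp\; n^{-\tilde{s}/(2\tilde{s}+1)}
  \quad \text{(up to logarithmic factors)}.
\]
% Isotropic check: s_j = s for all j gives \tilde{s} = s/d, hence
% \epsilon_n \asymp n^{-s/(2s+d)}, the familiar nonparametric rate.
```

A direction with large $s_j$ contributes little to $1/\tilde{s}$, so the benchmark rate is driven by the roughest coordinates rather than by the ambient dimension alone.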
📝 Abstract
We study posterior contraction rates for sparse Bayesian Kolmogorov-Arnold networks (KANs) over anisotropic Besov spaces, providing a statistical foundation for KANs from a Bayesian point of view. We show that sparse Bayesian KANs equipped with spike-and-slab-type sparsity priors attain near-minimax posterior contraction rates. In particular, the contraction rate depends on the intrinsic anisotropic smoothness of the underlying function. Moreover, by placing a hyperprior on a single model-size parameter, the resulting posterior adapts to unknown anisotropic smoothness and still achieves the corresponding near-minimax rate. A distinctive feature of our results, compared with those for standard sparse MLP-based models, is that the KAN depth can be kept fixed: owing to the flexibility of learnable spline edge functions, the required approximation complexity is controlled through the network width, spline-grid range and size, and parameter sparsity. Our analysis develops theoretical tools tailored to sparse spline-edge architectures, including approximation and complexity bounds for Bayesian KANs. We then extend the analysis to compositional Besov spaces and show that the contraction rates depend on the layerwise smoothness and effective dimension of the underlying compositional structure, thereby avoiding the curse of dimensionality. Together, the developed tools and findings advance the theoretical understanding of Bayesian neural networks and provide rigorous statistical foundations for KANs.
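To make the ingredients named in the abstract concrete, the following is a minimal sketch of one prior draw from a fixed-depth KAN whose spline coefficients follow a spike-and-slab prior. It is not the authors' construction: the degree-1 (hat-function) spline basis, the Gaussian slab, the function names, and all numeric settings (`widths`, `incl_prob`, `slab_scale`, the 9-point grid on [-1, 1]) are assumptions chosen for illustration.

```python
import numpy as np

def hat_basis(x, grid):
    """Degree-1 B-spline (hat) basis on a uniform grid; returns shape (len(x), len(grid))."""
    h = grid[1] - grid[0]  # uniform grid spacing
    return np.clip(1.0 - np.abs(x[:, None] - grid[None, :]) / h, 0.0, None)

def sample_spike_slab(shape, incl_prob, slab_scale, rng):
    """Each coefficient is exactly 0 with prob. 1 - incl_prob, else N(0, slab_scale^2)."""
    mask = rng.random(shape) < incl_prob
    return mask * rng.normal(0.0, slab_scale, size=shape)

def kan_layer(x, coef, grid):
    """x: (n, d_in); coef: (d_in, d_out, G). Output unit o is sum_i phi_{i,o}(x_i)."""
    d_in = x.shape[1]
    # basis[n, i, g] = g-th hat function evaluated at the i-th input coordinate
    basis = np.stack([hat_basis(x[:, i], grid) for i in range(d_in)], axis=1)
    return np.einsum("nig,iog->no", basis, coef)

rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 9)   # spline-grid range [-1, 1], size 9 (assumed)
widths = [3, 5, 1]                 # two KAN layers: depth is fixed, width varies
incl_prob, slab_scale = 0.2, 1.0   # hypothetical prior hyperparameters

coefs = [
    sample_spike_slab((widths[k], widths[k + 1], grid.size), incl_prob, slab_scale, rng)
    for k in range(len(widths) - 1)
]

x = rng.uniform(-1.0, 1.0, size=(4, 3))
out = x
for c in coefs:
    # Clip activations into the grid range before evaluating the spline edges.
    out = kan_layer(np.clip(out, grid[0], grid[-1]), c, grid)

sparsity = [float(np.mean(c == 0.0)) for c in coefs]  # fraction of exact zeros per layer
```

In this sketch the three complexity controls from the abstract appear as `widths`, `grid`, and `incl_prob`, while the number of layers stays fixed. A hyperprior on a single model-size parameter could be mimicked by drawing that parameter first and deriving `widths` and `grid.size` from it, though the specific link between them is not given in the source.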
Problem

Research questions and friction points this paper is trying to address.

posterior contraction
sparse Kolmogorov-Arnold networks
anisotropic Besov spaces
Bayesian neural networks
curse of dimensionality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian KAN
posterior contraction
anisotropic Besov spaces
spike-and-slab prior
curse of dimensionality