Scaling of Gaussian Kolmogorov--Arnold Networks

📅 2026-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited understanding of how the Gaussian scale parameter ε affects Gaussian Kolmogorov–Arnold Networks (KANs), particularly in deep architectures. By analyzing first-layer feature geometry, conditioning, and approximation behavior, the study argues that the choice of ε is governed primarily by the first layer, the only layer built directly on the input domain, and identifies a practical operating range of [1/(G−1), 2/(G−1)], where G is the number of Gaussian centers. The authors present this range as a stable design rule rather than a universal optimum; it turns ε selection from empirical tuning into an interpretable design principle that applies to fixed-scale selection, variable-scale constructions, and constrained training of ε. Experiments on function approximation and a physics-informed Helmholtz problem support the range, and a comparison against a matched Chebyshev reference shows that a properly scaled Gaussian KAN can be competitive in accuracy with another standard KAN basis.

📝 Abstract

The Gaussian scale parameter $ε$ is central to the behavior of Gaussian Kolmogorov--Arnold Networks (KANs), yet its role in deep edge-based architectures has not been studied systematically. In this paper, we investigate how $ε$ affects Gaussian KANs through first-layer feature geometry, conditioning, and approximation behavior. Our central observation is that scale selection is governed primarily by the first layer, since it is the only layer constructed directly on the input domain and any loss of distinguishability introduced there cannot be recovered by later layers. From this viewpoint, we analyze the first-layer feature matrix and identify a practical operating interval, \[ ε\in \left[\frac{1}{G-1},\frac{2}{G-1}\right], \] where $G$ denotes the number of Gaussian centers. For the standard shared-center Gaussian KAN used in current practice, we interpret this interval not as a universal optimality result, but as a stable and effective design rule, and validate it through brute-force sweeps over $ε$ across function-approximation problems with different collocation densities, grid resolutions, network architectures, and input dimensions, as well as a physics-informed Helmholtz problem. We further show that this range is useful for fixed-scale selection, variable-scale constructions, constrained training of $ε$, and efficient scale search using early training MSE. Finally, using a matched Chebyshev reference, we show that a properly scaled Gaussian KAN can already be competitive in accuracy relative to another standard KAN basis. In this way, the paper positions scale selection as a practical design principle for Gaussian KANs rather than as an ad hoc hyperparameter choice.
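The interval in the abstract can be made concrete with a small numerical sketch. The code below is illustrative only and is not the authors' implementation. It assumes centers spaced uniformly on [-1, 1] and a basis of the form exp(-((x - c)/ε)^2); neither choice is stated in the abstract. The function names (`epsilon_interval`, `gaussian_features`, `sweep_condition_numbers`, `constrained_eps`) are hypothetical. The sigmoid reparameterization is one possible way to keep a trainable ε inside the interval, not necessarily the constraint used in the paper.

```python
import numpy as np


def epsilon_interval(G):
    """Return the operating interval [1/(G-1), 2/(G-1)] quoted in the abstract."""
    if G < 2:
        raise ValueError("G must be at least 2")
    return 1.0 / (G - 1), 2.0 / (G - 1)


def gaussian_features(x, G, eps):
    """First-layer feature matrix with one column per Gaussian center.

    ASSUMPTION: centers are uniform on [-1, 1] and the basis is
    exp(-((x - c) / eps)^2); the paper may use a different convention.
    """
    centers = np.linspace(-1.0, 1.0, G)
    return np.exp(-(((x[:, None] - centers[None, :]) / eps) ** 2))


def sweep_condition_numbers(G=8, n_points=200, factors=(1.0, 2.0, 8.0)):
    """Condition number of the feature matrix for eps = factor / (G - 1)."""
    x = np.linspace(-1.0, 1.0, n_points)
    return {
        f: float(np.linalg.cond(gaussian_features(x, G, f / (G - 1))))
        for f in factors
    }


def constrained_eps(theta, G):
    """Map an unconstrained parameter theta into the operating interval.

    ASSUMPTION: a sigmoid reparameterization, chosen here for illustration.
    """
    lo, hi = epsilon_interval(G)
    return lo + (hi - lo) / (1.0 + np.exp(-theta))


if __name__ == "__main__":
    print(epsilon_interval(11))
    for factor, cond in sweep_condition_numbers().items():
        print(f"eps = {factor}/(G-1): cond = {cond:.3e}")
```

Under these assumptions, the sweep shows the conditioning side of the argument: when ε is large relative to the center spacing, neighboring Gaussians become nearly collinear and the condition number of the feature matrix grows. The sketch does not reproduce the paper's approximation-error sweeps.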

Problem

Research questions and friction points this paper is trying to address.

Gaussian KANs

scale parameter

feature geometry

function approximation

Kolmogorov–Arnold Networks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian KANs

scale parameter

feature geometry