KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example

📅 2024-08-05

🏛️ arXiv.org

📈 Citations: 4

✨ Influential: 0

🤖 AI Summary

This work investigates the performance and interpretability of Kolmogorov–Arnold Networks (KANs) on a high-energy physics (HEP) binary classification task, benchmarking them against multilayer perceptrons (MLPs). Method: KANs of varying depth and width are trained on high-level HEP features, and their learned activation functions are inspected. Contribution/Results: First, the learned activation functions of a one-layer KAN resemble the log-likelihood ratio of the input features, which gives them a direct statistical interpretation. Second, in deeper KANs the first-layer activations differ from those of the one-layer KAN, indicating that deeper KANs learn more complex representations of the data. Third, small KANs stay within 5% of MLP baseline performance while being easier to interpret. KANs are not found to be more parameter efficient than MLPs for this task; the trade-off they offer is a moderate loss in performance in exchange for improved interpretability.
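To make the log-likelihood-ratio comparison concrete, here is a minimal sketch of how a per-feature reference curve could be estimated from histograms. The toy Gaussian samples, the function name `binned_log_likelihood_ratio`, and the binning choices are illustrative assumptions, not the paper's dataset or procedure.

```python
import numpy as np

def binned_log_likelihood_ratio(signal, background, bins=20, value_range=None, eps=1e-12):
    """Estimate log[p(x|S) / p(x|B)] for one feature from normalized histograms.

    Hypothetical helper for illustration; `eps` guards against empty bins.
    """
    if value_range is None:
        value_range = (min(signal.min(), background.min()),
                       max(signal.max(), background.max()))
    # density=True normalizes each histogram to unit area within the range
    p_sig, edges = np.histogram(signal, bins=bins, range=value_range, density=True)
    p_bkg, _ = np.histogram(background, bins=edges, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    llr = np.log(p_sig + eps) - np.log(p_bkg + eps)
    return centers, llr

# Toy data (assumption): one feature, unit-width Gaussians centered at +1 and -1.
# For this choice the exact log-likelihood ratio is the straight line 2 * x.
rng = np.random.default_rng(0)
signal = rng.normal(loc=1.0, scale=1.0, size=200_000)
background = rng.normal(loc=-1.0, scale=1.0, size=200_000)

centers, llr = binned_log_likelihood_ratio(signal, background,
                                           bins=20, value_range=(-2.0, 2.0))
```

A learned one-layer KAN activation for the same feature could then be plotted against `llr` as a function of `centers`; agreement is only expected up to an additive constant.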

📝 Abstract

Recently, Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons, suggesting advantages in performance and interpretability. We study a typical binary event classification task in high-energy physics including high-level features and comment on the performance and interpretability of KANs in this context. We find that the learned activation functions of a one-layer KAN resemble the log-likelihood ratio of the input features. In deeper KANs, the activations in the first KAN layer differ from those in the one-layer KAN, which indicates that the deeper KANs learn more complex representations of the data. We study KANs with different depths and widths and we compare them to multilayer perceptrons in terms of performance and number of trainable parameters. For the chosen classification task, we do not find that KANs are more parameter efficient. However, small KANs may offer advantages in terms of interpretability that come at the cost of only a moderate loss in performance.
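The resemblance between one-layer KAN activations and the log-likelihood ratio can be motivated with a short derivation. It is a sketch under assumptions that the abstract does not state: the network output is passed through a sigmoid and trained with binary cross-entropy, and the input features are independent within each class. The symbols $\phi_i$, $\pi_S$, $\pi_B$ and $c_i$ are notation introduced here.

```latex
% Assumptions (not stated in the source): sigmoid output trained with
% binary cross-entropy; features independent within each class.
\begin{align}
f(x) &= \sum_{i=1}^{n} \phi_i(x_i)
  && \text{(one-layer KAN logit)} \\
f^{*}(x) &= \log\frac{p(x \mid S)}{p(x \mid B)} + \log\frac{\pi_S}{\pi_B}
  && \text{(cross-entropy optimum)} \\
\log\frac{p(x \mid S)}{p(x \mid B)} &= \sum_{i=1}^{n} \log\frac{p_i(x_i \mid S)}{p_i(x_i \mid B)}
  && \text{(independent features)} \\
\phi_i(x_i) &\approx \log\frac{p_i(x_i \mid S)}{p_i(x_i \mid B)} + c_i,
  \qquad \sum_{i=1}^{n} c_i = \log\frac{\pi_S}{\pi_B}
\end{align}
```

Here $\pi_S$ and $\pi_B$ are the signal and background fractions in the training sample. Only the sum of the offsets $c_i$ is fixed, so each activation matches its feature's log-likelihood ratio only up to a constant. When features are correlated, the factorization in the third line fails and a purely additive model cannot reproduce the full ratio, which is one plausible reading of why first-layer activations change in deeper KANs.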

Problem

Research questions and friction points this paper is trying to address.

Evaluate KANs for HEP binary event classification performance

Compare KANs and MLPs in parameter efficiency and interpretability

Analyze learned activation functions in shallow vs deep KANs
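The parameter-efficiency comparison depends on how trainable parameters are counted. The sketch below assumes a fully connected MLP with biases and a KAN in which every edge carries one B-spline with `grid_size + spline_order` coefficients; the function names and the `extra_per_edge` knob (for implementation-specific per-edge scale factors) are hypothetical, and real KAN libraries may count differently.

```python
def mlp_param_count(widths):
    """Weights plus biases of a fully connected MLP with the given layer widths."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(widths[:-1], widths[1:]))

def kan_param_count(widths, grid_size=5, spline_order=3, extra_per_edge=0):
    """Approximate trainable parameters of a KAN with the given layer widths.

    Assumption: each edge holds one B-spline on `grid_size` intervals of order
    `spline_order`, i.e. grid_size + spline_order coefficients, plus any
    implementation-specific extras passed as `extra_per_edge`.
    """
    per_edge = grid_size + spline_order + extra_per_edge
    return sum(n_in * n_out * per_edge for n_in, n_out in zip(widths[:-1], widths[1:]))

# Illustrative widths only, not the architectures studied in the paper
mlp_total = mlp_param_count([4, 8, 1])
kan_total = kan_param_count([4, 1], grid_size=5, spline_order=3)
print(f"MLP [4, 8, 1]: {mlp_total} parameters")
print(f"KAN [4, 1]:    {kan_total} parameters")
```

Because each KAN edge carries several coefficients where an MLP edge carries one weight, a KAN needs far fewer edges than an MLP to come out ahead on parameter count at equal performance.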

Innovation

Methods, ideas, or system contributions that make the work stand out.

KANs evaluated as an alternative to multilayer perceptrons for a HEP classification task

Deeper KANs learn more complex data representations than one-layer KANs

Small KANs balance interpretability and performance

🔎 Similar Papers

P1-KAN: an effective Kolmogorov-Arnold network with application to hydraulic valley optimization