🤖 AI Summary
To address the lack of interpretability in hyperparameter optimization (HPO), this paper proposes the first game-theoretic explanation framework grounded in Shapley values and higher-order interactions, enabling an additive decomposition of model performance over the hyperparameter space. Methodologically, it introduces Shapley interaction analysis, previously unexplored in HPO, establishing a unified paradigm for quantifying both individual hyperparameter importance and interactions of arbitrary order, and supporting local and global attribution as well as analysis of coupled effects. Extensive experiments across multiple HPO benchmarks reveal that low-order interactions, particularly second-order ones, dominate performance variation, yielding insights into algorithm tunability, optimizer behavior, and ablation effects. The framework enhances transparency and user trust in AutoML systems, providing both theoretical foundations and practical tools for interpretable hyperparameter debugging.
📝 Abstract
Hyperparameter optimization (HPO) is a crucial step in achieving strong predictive performance. However, the impact of individual hyperparameters on model generalization is highly context-dependent, prohibiting a one-size-fits-all solution and requiring opaque automated machine learning (AutoML) systems to find optimal configurations. The black-box nature of most AutoML systems undermines user trust and discourages adoption. To address this, we propose a game-theoretic explainability framework for HPO that is based on Shapley values and interactions. Our approach provides an additive decomposition of a performance measure across hyperparameters, enabling local and global explanations of hyperparameter importance and interactions. The framework, named HyperSHAP, offers insights into ablations, the tunability of learning algorithms, and optimizer behavior across different hyperparameter spaces. We evaluate HyperSHAP on various HPO benchmarks by analyzing the interaction structure of the HPO problem. Our results show that while higher-order interactions exist, most performance improvements can be explained by focusing on lower-order representations.
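To make the additive decomposition concrete, here is a minimal sketch of exact Shapley values and a pairwise Shapley interaction index for a toy "tunability game", where v(S) is the performance gain achievable when only the hyperparameters in S are tuned and the rest stay at defaults. The hyperparameter names and the synthetic scoring function are assumptions made for illustration; this is not HyperSHAP's actual API, which evaluates such games on real HPO benchmarks.

```python
from itertools import combinations
from math import factorial

# Hypothetical hyperparameters (illustrative only).
PLAYERS = ["lr", "depth", "l2"]

def v(coalition):
    """Toy 'tunability game': performance gain when only the hyperparameters
    in `coalition` are tuned (others fixed at defaults). The numbers are
    synthetic; a real game would run an HPO benchmark."""
    s = set(coalition)
    score = 0.0
    if "lr" in s:
        score += 0.10
    if "depth" in s:
        score += 0.04
    if "l2" in s:
        score += 0.01
    if {"lr", "depth"} <= s:
        score += 0.05  # a second-order interaction between lr and depth
    return score

def shapley_value(i, players, v):
    """Exact Shapley value of player i by enumerating all coalitions."""
    others = [p for p in players if p != i]
    n = len(players)
    phi = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            phi += w * (v(set(S) | {i}) - v(S))
    return phi

def shapley_interaction(i, j, players, v):
    """Exact pairwise Shapley interaction index of players i and j."""
    others = [p for p in players if p not in (i, j)]
    n = len(players)
    phi_ij = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            w = factorial(k) * factorial(n - k - 2) / factorial(n - 1)
            S = set(S)
            # Discrete second-order derivative of v at coalition S.
            delta = v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
            phi_ij += w * delta
    return phi_ij

phis = {h: shapley_value(h, PLAYERS, v) for h in PLAYERS}
```

By the efficiency property, the Shapley values sum to v({lr, depth, l2}) = 0.20, the full tuning gain, and the lr-depth interaction index recovers the 0.05 coupling term while the lr-l2 index is zero. This mirrors the paper's finding that a low-order (here second-order) representation captures the bulk of the performance variation.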