🤖 AI Summary
This work addresses a limitation of traditional multi-objective optimization methods: they overlook the dynamic nature of hyperparameter importance under varying objective trade-offs, often resulting in inefficient search and suboptimal solutions. To overcome this, we propose a novel approach that integrates HyperSHAP-based dynamic hyperparameter importance evaluation into the ParEGO optimization framework. By leveraging the objective weights that ParEGO produces during the search, our method adaptively prunes the configuration space, fixing less influential hyperparameters to concentrate the search on the critical dimensions. Combining HyperSHAP, ParEGO, and this adaptive space-reduction mechanism, the proposed algorithm significantly accelerates convergence and improves the quality of the Pareto front, as demonstrated on the PyMOO and YAHPO-Gym benchmarks.
📝 Abstract
Choosing a suitable ML model is a complex task that can depend on several objectives, e.g., accuracy, model size, fairness, inference time, or energy consumption. In practice, this requires trading off multiple, often competing, objectives through multi-objective optimization (MOO). However, existing MOO methods typically treat all hyperparameters as equally important, overlooking that hyperparameter importance (HPI) can vary significantly depending on the trade-off between objectives. We propose a novel dynamic optimization approach that prioritizes the most influential hyperparameters based on varying objective trade-offs during the search process, which accelerates empirical convergence and leads to better solutions. Building on prior work on HPI for MOO post-analysis, we now integrate HPI, calculated with HyperSHAP, into the optimization. For this, we leverage the objective weightings naturally produced by the MOO algorithm ParEGO and adapt the configuration space by fixing the unimportant hyperparameters, allowing the search to focus on the important ones. Finally, we validate our method with diverse tasks from PyMOO and YAHPO-Gym. Empirical results demonstrate improvements in convergence speed and Pareto front quality compared to baselines.
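To make the mechanism concrete, the sketch below illustrates the two ingredients the abstract combines: ParEGO's augmented Tchebycheff scalarization, which turns a sampled weight vector into a single objective, and an importance-based pruning step that fixes all but the most important hyperparameters. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the top-k pruning rule, and the use of a plain importance dictionary (standing in for scores computed by HyperSHAP) are all hypothetical, and the `rho` default follows the common ParEGO setting.

```python
import numpy as np

def parego_scalarize(objectives, weights, rho=0.05):
    """Augmented Tchebycheff scalarization as used by ParEGO.

    objectives: array of shape (n_points, n_objectives), assumed
    normalized to [0, 1]. weights: weight vector on the simplex,
    sampled anew each iteration by ParEGO. Lower is better.
    """
    weighted = weights * objectives          # element-wise weighting
    return weighted.max(axis=1) + rho * weighted.sum(axis=1)

def prune_space(space, importance, keep_top_k, defaults):
    """Reduce the configuration space to its most important dimensions.

    importance: hyperparameter name -> HPI score (here a given dict;
    in the paper these would come from HyperSHAP for the current
    objective weighting). All other hyperparameters are fixed at
    default values so the search concentrates on the active ones.
    """
    ranked = sorted(space, key=lambda h: importance.get(h, 0.0), reverse=True)
    active = set(ranked[:keep_top_k])
    reduced = {h: space[h] for h in active}
    fixed = {h: defaults[h] for h in space if h not in active}
    return reduced, fixed

# Illustrative use: two candidate points under equal objective weights.
points = np.array([[0.2, 0.8], [0.5, 0.5]])
scores = parego_scalarize(points, np.array([0.5, 0.5]))

# Prune a toy space of three hyperparameters down to the top two.
space = {"lr": (1e-4, 1e-1), "depth": (1, 10), "dropout": (0.0, 0.5)}
imp = {"lr": 0.6, "depth": 0.3, "dropout": 0.1}
reduced, fixed = prune_space(space, imp, keep_top_k=2,
                             defaults={"lr": 1e-3, "depth": 5, "dropout": 0.1})
```

In this toy run, `dropout` (the least important hyperparameter for the current weighting) is fixed at its default while `lr` and `depth` remain searchable; re-running the pruning with each new ParEGO weight vector is what makes the space reduction adaptive.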