AI Summary
This paper addresses multi-objective bilevel learning (MOBL), a novel challenge involving the simultaneous optimization of multiple conflicting objectives under coupled upper- and lower-level variables, by establishing its theoretical foundations and an algorithmic framework. We introduce the concept of *preference-guided Pareto-stationary solutions* and propose Weighted-Chebyshev Multi-Hypergradient Descent (WC-MHGD), a unified algorithmic framework supporting both deterministic and stochastic settings, with finite-time convergence guarantees and low oracle query complexity. WC-MHGD is the first method to enable systematic exploration of the Pareto front while efficiently computing Pareto-stationary solutions. Extensive experiments validate its theoretical convergence rates and superior Pareto-front coverage, consistently outperforming existing bilevel and multi-objective optimization baselines. Our work delivers the first solution to MOBL that is both theoretically rigorous and practically scalable.
Abstract
As machine learning (ML) applications have grown increasingly complex in recent years, modern ML frameworks often need to address multiple potentially conflicting objectives with decision variables coupled across different layers. This creates a compelling need for multi-objective bilevel learning (MOBL). So far, however, the field of MOBL remains in its infancy, and many important problems remain under-explored. This motivates us to fill this gap and systematically investigate the theoretical and algorithmic foundations of MOBL. Specifically, we consider MOBL problems with multiple conflicting, preference-guided objectives at the upper-level subproblem, where part of the inputs depend on the optimal solution of the lower-level subproblem. Our goal is to develop efficient MOBL optimization algorithms that (1) identify a preference-guided Pareto-stationary solution with low oracle complexity, and (2) enable systematic Pareto-front exploration. To this end, we propose a unifying algorithmic framework called weighted-Chebyshev multi-hypergradient descent (WC-MHGD) for both deterministic and stochastic settings, with finite-time Pareto-stationarity convergence guarantees that not only imply low oracle complexity but also induce systematic Pareto-front exploration. We further conduct extensive experiments to confirm our theoretical results.
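To make the core idea concrete, the following is a minimal sketch (not the paper's WC-MHGD implementation) of a weighted-Chebyshev hypergradient step on a toy bilevel problem. All problem data here are illustrative assumptions: the lower level is a strongly convex quadratic with a closed-form best response `y*(x) = A @ x`, the two upper-level objectives `F1`, `F2` are simple quadratics, and a preference vector `w` scalarizes them via `max_i w_i * F_i(x)`. Sweeping `w` yields different preference-guided solutions, which is how Pareto-front exploration works at a high level.

```python
# Illustrative weighted-Chebyshev hypergradient descent on a toy bilevel
# problem. All objects (A, a, b, c, step size) are made-up assumptions.
import numpy as np

A = np.array([[1.0, 0.0], [0.5, 1.0]])  # lower-level response map: y*(x) = A @ x

def lower_level_solution(x):
    # Assumed lower level g(x, y) = 0.5 * ||y - A x||^2 (strongly convex),
    # so the optimal lower-level solution is available in closed form.
    return A @ x

def hypergradients(x):
    """Values and gradients of F_i(x) = f_i(x, y*(x)), via the chain rule
    (dy*/dx = A for this toy lower level)."""
    y = lower_level_solution(x)
    a, b, c = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
    F1 = np.sum((x - a) ** 2) + np.sum(y ** 2)
    F2 = np.sum((x - b) ** 2) + np.sum((y - c) ** 2)
    g1 = 2 * (x - a) + A.T @ (2 * y)         # hypergradient of F1
    g2 = 2 * (x - b) + A.T @ (2 * (y - c))   # hypergradient of F2
    return np.array([F1, F2]), [g1, g2]

def wc_step(x, w, lr=0.05):
    """One subgradient step on the weighted-Chebyshev scalarization
    max_i w_i * F_i(x): descend along the currently active objective."""
    F, grads = hypergradients(x)
    i = int(np.argmax(w * F))                # active (worst-case) objective
    return x - lr * w[i] * grads[i]

# Different preference vectors steer descent toward different
# Pareto-stationary points, tracing out part of the Pareto front.
for w in [np.array([0.8, 0.2]), np.array([0.2, 0.8])]:
    x = np.zeros(2)
    for _ in range(200):
        x = wc_step(x, w)
```

The actual WC-MHGD framework additionally handles stochastic oracles and lower-level problems without closed-form solutions (where hypergradients must be estimated), with finite-time convergence guarantees; this sketch only conveys the scalarize-then-descend mechanism.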