A Tale of Two Problems: Multi-Task Bilevel Learning Meets Equality Constrained Multi-Objective Optimization

📅 2026-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work extends bilevel optimization from the single-task to the multi-task setting. Under the relaxed assumption that the lower-level problem is merely convex (strong convexity is not required), it establishes the first formal connection between multi-task bilevel learning and equality-constrained multi-objective optimization. Building on the Karush–Kuhn–Tucker (KKT) conditions, the authors introduce a Pareto stationarity criterion and develop a weighted Chebyshev penalty algorithm that converges to it. The proposed method achieves a finite-time convergence rate of $O(S T^{-1/2})$ in both deterministic and stochastic regimes, where $S$ is the number of objectives and $T$ is the number of iterations, and it explores the Pareto front systematically by varying the preference vector. The result is a new theoretical framework and a computational approach for multi-task bilevel learning.
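For orientation, a generic equality-constrained multi-objective problem and one standard KKT-style stationarity condition can be written as follows. This is a textbook-style sketch, not the paper's definition: the symbols $z$, $h$, $\lambda$, $\nu$, and $\Delta_S$ are assumptions introduced here, and the paper's exact criterion may differ.

$$
\min_{z} \; \big(f_1(z), \dots, f_S(z)\big) \quad \text{s.t.} \quad h(z) = 0,
$$

$$
\exists\, \lambda \in \Delta_S,\ \nu: \quad \sum_{s=1}^{S} \lambda_s \nabla f_s(z) + \nabla h(z)^{\top} \nu = 0, \qquad h(z) = 0,
$$

where $\Delta_S = \{\lambda \in \mathbb{R}^S : \lambda_s \ge 0,\ \sum_{s} \lambda_s = 1\}$ is the probability simplex. With a single objective ($S = 1$), this reduces to the usual KKT conditions for equality-constrained optimization.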

📝 Abstract

In recent years, bilevel optimization (BLO) has attracted significant attention for its broad applications in machine learning. However, most existing works on BLO remain confined to the single-task setting and rely on the lower-level strong convexity assumption, which significantly restricts their applicability to modern machine learning problems of growing complexity. In this paper, we make the first attempt to extend BLO to the multi-task setting under a relaxed lower-level general convexity (LLGC) assumption. To this end, we reformulate the multi-task bilevel learning (MTBL) problem with LLGC into an equality constrained multi-objective optimization (ECMO) problem. However, ECMO itself is a new problem that has not yet been studied in the literature. To address this gap, we first establish a new Karush-Kuhn-Tucker (KKT)-based Pareto stationarity as the convergence criterion for ECMO algorithm design. Based on this foundation, we propose a weighted Chebyshev (WC)-penalty algorithm that achieves a finite-time convergence rate of $O(ST^{-\frac{1}{2}})$ to KKT-based Pareto stationarity in both deterministic and stochastic settings, where $S$ denotes the number of objectives, and $T$ is the total iterations. Moreover, by varying the preference vector over the $S$-dimensional simplex, our WC-penalty method systematically explores the Pareto front. Finally, solutions to the ECMO problem translate directly into solutions for the original MTBL problem, thereby closing the loop between these two foundational optimization frameworks.
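To make the idea of a weighted Chebyshev penalty concrete, here is a minimal sketch on a toy problem. It is not the paper's algorithm: it assumes the common scalarization $\max_s \lambda_s f_s(x) + \frac{\rho}{2} h(x)^2$ with a zero reference point, a plain subgradient step with a $1/\sqrt{t}$ step size, and hypothetical names (`f1`, `f2`, `h`, `wc_penalty_descent`, `rho`, `eta0`) chosen purely for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# Each function returns (value, gradient) at x.
ValueGrad = Callable[[Sequence[float]], Tuple[float, Vector]]


def f1(x: Sequence[float]) -> Tuple[float, Vector]:
    """Toy objective: squared distance to (0, 0)."""
    return x[0] ** 2 + x[1] ** 2, [2 * x[0], 2 * x[1]]


def f2(x: Sequence[float]) -> Tuple[float, Vector]:
    """Toy objective: squared distance to (2, 0)."""
    return (x[0] - 2) ** 2 + x[1] ** 2, [2 * (x[0] - 2), 2 * x[1]]


def h(x: Sequence[float]) -> Tuple[float, Vector]:
    """Toy equality constraint h(x) = x[1] - 1 = 0."""
    return x[1] - 1.0, [0.0, 1.0]


def wc_penalty_descent(
    objectives: Sequence[ValueGrad],
    constraint: ValueGrad,
    weights: Sequence[float],
    x0: Sequence[float],
    rho: float = 50.0,   # penalty strength (assumed value)
    eta0: float = 0.02,  # base step size (assumed value)
    steps: int = 5000,
) -> Vector:
    """Subgradient descent on max_s w_s * f_s(x) + (rho / 2) * h(x)^2."""
    x = list(x0)
    for t in range(steps):
        evals = [f(x) for f in objectives]
        # The objective attaining the weighted maximum supplies a subgradient.
        s = max(range(len(evals)), key=lambda i: weights[i] * evals[i][0])
        h_val, h_grad = constraint(x)
        step = eta0 / math.sqrt(t + 1)  # diminishing step size
        x = [
            xi - step * (weights[s] * gi + rho * h_val * hi)
            for xi, gi, hi in zip(x, evals[s][1], h_grad)
        ]
    return x


# Two preference vectors on the 2-dimensional simplex.
x_equal = wc_penalty_descent([f1, f2], h, [0.5, 0.5], [0.0, 0.0])
x_skew = wc_penalty_descent([f1, f2], h, [0.8, 0.2], [0.0, 0.0])
print("equal weights:", x_equal)
print("skewed weights:", x_skew)
```

In this toy setting, equal weights balance the two objectives near `x[0] = 1`, while weighting `f1` more heavily pulls the solution toward `f1`'s minimizer, which illustrates how changing the preference vector moves the solution along the Pareto front. Because the penalty strength is finite, the constraint is only approximately satisfied.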

Problem

Research questions and friction points this paper is trying to address.

bilevel optimization

multi-task learning

equality constrained multi-objective optimization

general convexity

Pareto stationarity

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-task bilevel learning

equality constrained multi-objective optimization

KKT-based Pareto stationarity