Sequential learning of the Pareto front for multi-objective bandits

📅 2025-01-29

🏛️ International Conference on Artificial Intelligence and Statistics

📈 Citations: 0

✨ Influential: 0

career value

256K/year

🤖 AI Summary

This paper addresses the sequential identification of the Pareto front in multi-objective multi-armed bandits, aiming to exactly identify all Pareto-optimal arms with probability at least $1-delta$. We propose the first computationally efficient algorithm achieving the theoretical optimal sample complexity. Its per-round time complexity is $O(Kp^d)$, significantly improving upon existing methods. The core innovations include: (i) adaptive construction of confidence intervals tailored to multi-objective rewards; (ii) dynamic判定 of Pareto dominance relations; and (iii) an incremental sampling strategy that jointly optimizes statistical accuracy and computational efficiency. We prove that the algorithm achieves the information-theoretic lower bound asymptotically as $delta o 0$. Empirical evaluation demonstrates superior performance in high-dimensional multi-objective settings—achieving higher identification accuracy and faster convergence compared to state-of-the-art baselines.

Technology Category

Application Category

📝 Abstract

We study the problem of sequential learning of the Pareto front in multi-objective multi-armed bandits. An agent is faced with K possible arms to pull. At each turn she picks one, and receives a vector-valued reward. When she thinks she has enough information to identify the Pareto front of the different arm means, she stops the game and gives an answer. We are interested in designing algorithms such that the answer given is correct with probability at least 1-$delta$. Our main contribution is an efficient implementation of an algorithm achieving the optimal sample complexity when the risk $delta$ is small. With K arms in d dimensions p of which are in the Pareto set, the algorithm runs in time O(Kp^d) per round.

Problem

Research questions and friction points this paper is trying to address.

Multi-objective Decision Making

Pareto Optimal Solutions

Probability of Correctness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-objective Decision-making

Pareto Optimal Solutions

Efficient Algorithm

🔎 Similar Papers

Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning