Bandit Pareto Set Identification: the Fixed Budget Setting

📅 2023-11-07

🏛️ International Conference on Artificial Intelligence and Statistics

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This paper addresses Pareto set identification in fixed-budget multi-objective multi-armed bandits. It proposes the first algorithms for this task: Empirical Gap Elimination (EGE), a family of algorithms that combines an estimate of how hard each arm is to classify as inside or outside the Pareto set with a generic elimination scheme. Two instances, EGE-SR and EGE-SH, have an error probability that decays exponentially with the budget, with an exponent supported by an information-theoretic lower bound. An empirical study on synthetic and real-world datasets shows that the algorithms perform well.

📝 Abstract

We study a multi-objective pure exploration problem in a multi-armed bandit model. Each arm is associated with an unknown multivariate distribution, and the goal is to identify the distributions whose mean is not uniformly worse than that of another distribution: the Pareto optimal set. We propose and analyze the first algorithms for the fixed budget Pareto Set Identification task. We propose Empirical Gap Elimination, a family of algorithms combining a careful estimation of the "hardness to classify" each arm in or out of the Pareto set with a generic elimination scheme. We prove that two particular instances, EGE-SR and EGE-SH, have a probability of error that decays exponentially fast with the budget, with an exponent supported by an information-theoretic lower bound. We complement these findings with an empirical study using real-world and synthetic datasets, which showcases the good performance of our algorithms.
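
To make the target of the identification task concrete, here is a minimal Python sketch that computes the Pareto optimal set from known mean vectors. It is not the paper's algorithm: EGE has to estimate these means from samples under a budget, whereas this sketch takes them as given. The function name `pareto_set` is hypothetical, and the dominance convention is an assumption: larger is better on every objective, and arm j dominates arm i when it is at least as good on all objectives and strictly better on at least one.

```python
import numpy as np


def pareto_set(means):
    """Return the indices of arms that no other arm dominates.

    Assumed convention (not taken from the paper): larger is better, and
    arm j dominates arm i if means[j] >= means[i] on every objective and
    means[j] > means[i] on at least one.
    """
    means = np.asarray(means, dtype=float)
    num_arms = means.shape[0]
    optimal = []
    for i in range(num_arms):
        dominated = False
        for j in range(num_arms):
            if j == i:
                continue
            # Check whether arm j is at least as good everywhere
            # and strictly better somewhere.
            if np.all(means[j] >= means[i]) and np.any(means[j] > means[i]):
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal


# Four arms, two objectives; arm 3 is dominated by arm 1.
example_means = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.4, 0.4]]
print(pareto_set(example_means))
```

In the bandit setting the means are unknown, so a learner would apply a check like this to empirical means, which is where the difficulty of classifying each arm comes in.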

Problem

Research questions and friction points this paper is trying to address.

Multi-Armed Bandit Problem

Pareto Optimality

Fixed Budget

Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical Gap Elimination

Pareto Optimal Selection

Multi-Armed Bandit Model

🔎 Similar Papers

Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning