🤖 AI Summary
This work addresses the challenge posed by the exponentially large action space in combinatorial bandits, where the number of actions \(N\) is exponential in the problem's dimension and existing algorithms struggle to simultaneously achieve low regret and computational efficiency. The paper proposes the first efficient algorithm that attains a swap-regret bound scaling polylogarithmically in \(N\) while maintaining a per-round computational cost of only \(\mathrm{polylog}(N)\). By integrating techniques from combinatorial optimization and online learning theory, the method introduces a carefully designed sampling and feedback mechanism that yields a low-overhead framework for swap-regret minimization. This approach offers both tight theoretical guarantees and practical scalability across a range of canonical combinatorial settings.
📝 Abstract
This paper addresses the problem of designing efficient no-swap-regret algorithms for combinatorial bandits, where the number of actions $N$ is exponentially large in the dimensionality of the problem. In this setting, designing an efficient no-swap-regret algorithm translates to achieving swap regret that is sublinear in the horizon $T$ with only polylogarithmic dependence on $N$. In contrast to the weaker notion of external regret minimization -- a problem that is fairly well understood in the literature -- achieving no-swap regret with polylogarithmic dependence on $N$ has remained elusive in combinatorial bandits. Our paper resolves this challenge by introducing a no-swap-regret learning algorithm whose regret scales polylogarithmically in $N$ and is tight for the class of combinatorial bandits. To ground our results, we also demonstrate how to implement the proposed algorithm efficiently -- that is, with a per-iteration complexity that also scales polylogarithmically in $N$ -- across a wide range of well-studied applications.
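For readers unfamiliar with the benchmark, swap regret compares the learner's cumulative reward against the best hindsight "swap function" $\phi$ that remaps every action it played to some replacement action. The sketch below (not the paper's algorithm, and using full-information rewards purely for illustration) computes the empirical swap regret of a played action sequence; the function name and interface are hypothetical:

```python
import numpy as np

def swap_regret(actions, rewards):
    """Empirical swap regret of a played action sequence.

    actions: length-T sequence of played action indices in {0, ..., N-1}.
    rewards: T x N array; rewards[t, a] is the reward of action a at round t
             (full-information rewards, assumed here only for illustration).

    A swap function phi maps each action to a replacement. The best phi
    decomposes per action: for each action i, pick the replacement j that
    maximizes the total reward over the rounds where i was played.
    """
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    T, N = rewards.shape
    # Reward actually earned by the learner.
    earned = rewards[np.arange(T), actions].sum()
    # Reward of the best swap function, chosen in hindsight.
    best_swapped = 0.0
    for i in range(N):
        mask = actions == i
        if mask.any():
            # Best single replacement for all rounds where i was played.
            best_swapped += rewards[mask].sum(axis=0).max()
    return best_swapped - earned
```

A no-swap-regret algorithm drives this quantity to $o(T)$; the paper's contribution is doing so with both the regret bound and the per-round computation scaling as $\mathrm{polylog}(N)$, rather than polynomially in the exponentially large $N$.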