Efficient Swap Regret Minimization in Combinatorial Bandits

📅 2026-02-02
🤖 AI Summary
This work addresses combinatorial bandits, where the number of actions \(N\) is exponentially large in the problem dimension and existing algorithms struggle to achieve low swap regret and computational efficiency at the same time. The paper proposes the first efficient algorithm whose swap regret is sublinear in the horizon \(T\) and depends only polylogarithmically on \(N\), while keeping the per-round computational cost at \(\mathrm{polylog}(N)\). By integrating techniques from combinatorial optimization and online learning theory, the method introduces a sampling and feedback mechanism that enables a low-overhead framework for swap-regret minimization. The resulting regret bound is tight for the class of combinatorial bandits, and the algorithm can be implemented efficiently across a range of well-studied combinatorial applications.

📝 Abstract
This paper addresses the problem of designing efficient no-swap-regret algorithms for combinatorial bandits, where the number of actions $N$ is exponentially large in the dimensionality of the problem. In this setting, an efficient no-swap-regret algorithm is one whose swap regret is sublinear in the horizon $T$ and depends only polylogarithmically on $N$. In contrast to the weaker notion of external regret minimization, a problem which is fairly well understood in the literature, achieving no-swap regret with a polylogarithmic dependence on $N$ has remained elusive in combinatorial bandits. Our paper resolves this challenge by introducing a no-swap-regret learning algorithm whose regret scales polylogarithmically in $N$ and is tight for the class of combinatorial bandits. To ground our results, we also demonstrate how to implement the proposed algorithm efficiently (that is, with a per-iteration complexity that also scales polylogarithmically in $N$) across a wide range of well-studied applications.
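To make the distinction between external and swap regret concrete, the sketch below computes both in hindsight for a small, fully observed loss table. It uses the standard textbook definitions rather than the paper's own notation, and it is not the paper's algorithm; the function names and the toy loss table are assumptions chosen for illustration.

```python
def external_regret(losses, actions):
    """Regret against the best single fixed action in hindsight.

    losses[t][b] is the loss of action b at round t (hypothetical toy data);
    actions[t] is the index of the action actually played at round t.
    """
    incurred = sum(losses[t][a] for t, a in enumerate(actions))
    n = len(losses[0])
    best_fixed = min(sum(row[b] for row in losses) for b in range(n))
    return incurred - best_fixed


def swap_regret(losses, actions):
    """Regret against the best swap function phi: action -> action.

    The best phi can be chosen independently for each played action a:
    restrict to the rounds where a was played and pick the best replacement.
    """
    incurred = sum(losses[t][a] for t, a in enumerate(actions))
    n = len(losses[0])
    best_swapped = 0
    for a in set(actions):
        rounds = [t for t, played in enumerate(actions) if played == a]
        best_swapped += min(sum(losses[t][b] for t in rounds) for b in range(n))
    return incurred - best_swapped


# Toy example: two actions, and the learner always plays the worse one.
losses = [[0, 1], [1, 0], [0, 1], [1, 0]]
actions = [1, 0, 1, 0]
ext = external_regret(losses, actions)
swp = swap_regret(losses, actions)
```

In this toy example no fixed action does well, yet swapping each played action for the other removes all loss, so swap regret exceeds external regret. Both functions enumerate all $N$ actions explicitly, which is exactly what becomes infeasible when $N$ is exponential in the problem dimension.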
Problem

Research questions and friction points this paper is trying to address.

swap regret
combinatorial bandits
no-swap regret
regret minimization
large action space
Innovation

Methods, ideas, or system contributions that make the work stand out.

swap regret minimization
combinatorial bandits
polylogarithmic dependence
efficient learning algorithm
sublinear regret
A. Kontogiannis
National Technical University of Athens, School of Electrical and Computer Engineering; Archimedes, Athena Research Center, Greece
Vasilis Pollatos
Archimedes AI, NKUA
machine learning, algorithms
P. Mertikopoulos
Archimedes, Athena Research Center, Greece; Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France
Ioannis Panageas
Assistant Professor, University of California, Irvine
Algorithms, Optimization, Learning in Games, Multi-agent RL, Statistics