🤖 AI Summary
This paper addresses both stochastic and adversarial multi-armed bandit problems by proposing the first unified analytical framework for the Tsallis-INF algorithm that entirely avoids Fenchel conjugates. Methodologically, it leverages modern tools from online convex optimization—specifically, Bregman divergences and direct characterization of dual updates—to replace conventional, conjugate-dependent derivations. This yields concise, unified proofs of optimal regret bounds: $O(\sqrt{KT})$ in the adversarial setting and $O\big(\sum_{i:\Delta_i>0} \frac{\log T}{\Delta_i}\big)$ in the stochastic setting. The key contribution is the complete elimination of Fenchel conjugates, markedly enhancing theoretical interpretability and analytical scalability. To the best of our knowledge, this work is the first to rigorously establish that Tsallis-INF simultaneously achieves optimal performance across both environments while admitting a significantly simplified, conjugate-free analysis.
📝 Abstract
In this short note, we present a simple derivation of the best-of-both-worlds guarantee for the Tsallis-INF multi-armed bandit algorithm from J. Zimmert and Y. Seldin, "Tsallis-INF: An optimal algorithm for stochastic and adversarial bandits," Journal of Machine Learning Research, 22(28):1–49, 2021. URL https://jmlr.csail.mit.edu/papers/volume22/19-753/19-753.pdf. In particular, the proof uses modern tools from online convex optimization and avoids the use of conjugate functions. Also, we do not optimize the constants in the bounds, in favor of a slimmer proof.
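To make the conjugate-free viewpoint concrete: for the $\alpha = 1/2$ Tsallis regularizer, the distribution played at round $t$ can be characterized directly as $p_i = 4/(\eta_t (\hat{L}_i - x))^2$, where $x$ is the Lagrange multiplier enforcing $\sum_i p_i = 1$ and found by Newton's method, with no conjugate function in sight. The sketch below illustrates this; the learning-rate schedule $\eta_t = 2/\sqrt{t}$, the Newton initialization, and the toy Bernoulli environment are illustrative choices, not prescriptions from the note.

```python
import numpy as np

def tsallis_inf_probs(L, t, newton_iters=50):
    """Tsallis-INF (alpha = 1/2) arm distribution for cumulative loss estimates L.

    Solves p_i = 4 / (eta_t * (L_i - x))^2 with x chosen so that sum_i p_i = 1,
    via Newton's method on f(x) = sum_i p_i(x) - 1 (note f'(x) = eta * sum_i p_i^{3/2}).
    """
    eta = 2.0 / np.sqrt(t)        # illustrative learning-rate schedule
    x = np.min(L) - 2.0 / eta     # keeps L_i - x > 0 and starts with sum(p) >= 1
    for _ in range(newton_iters):
        p = 4.0 / (eta * (L - x)) ** 2
        x -= (p.sum() - 1.0) / (eta * np.sum(p ** 1.5))
    return 4.0 / (eta * (L - x)) ** 2

# Toy run on a 5-armed Bernoulli bandit (arm 0 has the smallest mean loss).
rng = np.random.default_rng(0)
K, T = 5, 2000
means = np.array([0.1, 0.5, 0.5, 0.5, 0.5])
L_hat = np.zeros(K)                       # importance-weighted cumulative losses
for t in range(1, T + 1):
    p = tsallis_inf_probs(L_hat, t)
    i = rng.choice(K, p=p / p.sum())      # renormalize against numerical drift
    loss = rng.binomial(1, means[i])
    L_hat[i] += loss / p[i]               # standard importance-weighted estimator
```

The Newton step uses the closed-form derivative $\mathrm{d}p_i/\mathrm{d}x = \eta\, p_i^{3/2}$, which is exactly the normalization routine given in Zimmert and Seldin's pseudocode; the point of the note is that the regret analysis, too, can work with this direct characterization instead of conjugate functions.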