Near-Optimal Regret for Distributed Adversarial Bandits: A Black-Box Approach

📅 2026-02-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies distributed adversarial bandits, in which $N$ agents communicate through gossip and cooperate to minimize the global average loss while each observes only its own local losses. Using a black-box reduction to bandits with delayed feedback, the authors show that the minimax regret is $\tilde{\Theta}(\sqrt{(\rho^{-1/2}+K/N)T})$, improving on the previous best bound $\tilde{O}(\rho^{-1/3}(KT)^{2/3})$ of Yi and Vojnovic (2023). A matching lower bound splits the difficulty into a communication cost $\rho^{-1/4}\sqrt{T}$ and a bandit cost $\sqrt{KT/N}$. The same approach yields first-order and best-of-both-worlds bounds and extends to distributed linear bandits in $\mathbb{R}^d$ with regret $\tilde{O}(\sqrt{(\rho^{-1/2}+1/N)dT})$.

📝 Abstract

We study distributed adversarial bandits, where $N$ agents cooperate to minimize the global average loss while observing only their own local losses. We show that the minimax regret for this problem is $\tilde{\Theta}(\sqrt{(\rho^{-1/2}+K/N)T})$, where $T$ is the horizon, $K$ is the number of actions, and $\rho$ is the spectral gap of the communication matrix. Our algorithm, based on a novel black-box reduction to bandits with delayed feedback, requires agents to communicate only through gossip. It achieves an upper bound that significantly improves over the previous best bound $\tilde{O}(\rho^{-1/3}(KT)^{2/3})$ of Yi and Vojnovic (2023). We complement this result with a matching lower bound, showing that the problem's difficulty decomposes into a communication cost $\rho^{-1/4}\sqrt{T}$ and a bandit cost $\sqrt{KT/N}$. We further demonstrate the versatility of our approach by deriving first-order and best-of-both-worlds bounds in the distributed adversarial setting. Finally, we extend our framework to distributed linear bandits in $\mathbb{R}^d$, obtaining a regret bound of $\tilde{O}(\sqrt{(\rho^{-1/2}+1/N)dT})$, achieved with only $O(d)$ communication cost per agent and per round via a volumetric spanner.
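The rates in the abstract can be explored numerically. The sketch below is illustrative only and is not the paper's algorithm: it drops all constants and logarithmic factors, and every function name, the ring topology, and the uniform 1/3 gossip weights are assumptions made for this example. It evaluates the new rate, its two components, and the previous rate, and relies on the elementary inequality $\sqrt{a+b} \le \sqrt{a}+\sqrt{b} \le \sqrt{2(a+b)}$ to show why the rate splits into a communication term and a bandit term. It also runs plain gossip averaging to show how agents can approach the global average loss while exchanging values only with neighbors.

```python
import math


def minimax_rate(T, K, N, rho):
    """sqrt((rho^{-1/2} + K/N) * T); constants and log factors dropped."""
    return math.sqrt((rho ** -0.5 + K / N) * T)


def communication_cost(T, rho):
    """Communication term rho^{-1/4} * sqrt(T)."""
    return rho ** -0.25 * math.sqrt(T)


def bandit_cost(T, K, N):
    """Bandit term sqrt(K * T / N)."""
    return math.sqrt(K * T / N)


def previous_rate(T, K, rho):
    """Earlier rate rho^{-1/3} * (K * T)^{2/3}; constants and log factors dropped."""
    return rho ** (-1.0 / 3.0) * (K * T) ** (2.0 / 3.0)


def ring_gossip_matrix(n):
    """Hypothetical doubly stochastic gossip matrix on a ring of n >= 3 agents:
    weight 1/3 on the agent itself and on each of its two neighbors."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def gossip_round(values, W):
    """One gossip step: each agent takes a W-weighted average of the values
    held by itself and its neighbors."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    T, K, N, rho = 10_000, 10, 16, 0.1  # example values, chosen arbitrarily
    total = minimax_rate(T, K, N, rho)
    parts = communication_cost(T, rho) + bandit_cost(T, K, N)
    print("new rate:", total)
    print("communication + bandit:", parts)
    print("previous rate:", previous_rate(T, K, rho))

    # Gossip averaging of made-up local losses on an 8-agent ring.
    local_losses = [0.9, 0.1, 0.4, 0.7, 0.2, 0.8, 0.3, 0.6]
    W = ring_gossip_matrix(len(local_losses))
    estimates = list(local_losses)
    for _ in range(50):
        estimates = gossip_round(estimates, W)
    print("true average:", sum(local_losses) / len(local_losses))
    print("agent estimates:", estimates)
```

Because the gossip matrix is doubly stochastic, each round preserves the average of the agents' values, and the speed at which the estimates agree depends on how well connected the graph is, which is the role $\rho$ plays in the bounds.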

Problem

Research questions and friction points this paper is trying to address.

distributed adversarial bandits

minimax regret

gossip communication

spectral gap

delayed feedback

Innovation

Methods, ideas, or system contributions that make the work stand out.

distributed adversarial bandits

black-box reduction

delayed feedback