🤖 AI Summary
This work studies the $K$-armed multi-armed bandit (MAB) problem with $M$ heterogeneous data sources whose noise variances are unknown and distinct, so the learner must adaptively choose both an arm and a data source at each round. The proposed algorithm, SOAR, first prunes high-variance sources using variance-concentration bounds and then runs a balanced min-max LCB-UCB procedure that jointly identifies the best arm and the minimum-variance source. The analysis shows that SOAR matches the optimal instance-dependent regret of a single-source MAB with the minimum variance ${\sigma^*}^2$, up to an unavoidable additive cost of $\tilde O(\sqrt{K \sum_{j=1}^M \sigma_j^2})$ for source identification, improving on baselines such as Uniform UCB or Explore-then-Commit UCB, whose regret can scale with $\sigma_{\max}^2$. Experiments on synthetic instances and the MovieLens 25M dataset support these findings.
📝 Abstract
We study the $K$-armed multi-armed bandit (MAB) problem with $M$ heterogeneous data sources, each exhibiting unknown and distinct noise variances $\{\sigma_j^2\}_{j=1}^M$. The learner's objective is standard MAB regret minimization, with the additional complexity of adaptively selecting which data source to query at each round. We propose Source-Optimistic Adaptive Regret minimization (SOAR), a novel algorithm that quickly prunes high-variance sources using sharp variance-concentration bounds, followed by a balanced min-max LCB-UCB approach that seamlessly integrates the parallel tasks of identifying the best arm and the optimal (minimum-variance) data source. Our analysis shows that SOAR achieves an instance-dependent regret bound of $\tilde{O}\left({\sigma^*}^2\sum_{i=2}^K \frac{\log T}{\Delta_i} + \sqrt{K \sum_{j=1}^M \sigma_j^2}\right)$, up to preprocessing costs depending only on problem parameters, where ${\sigma^*}^2 := \min_j \sigma_j^2$ is the minimum source variance and $\Delta_i$ denotes the suboptimality gap of the $i$-th arm. This result is surprising: despite lacking prior knowledge of the minimum-variance source among the $M$ alternatives, SOAR attains the optimal instance-dependent regret of standard single-source MAB with variance ${\sigma^*}^2$, while incurring only a small (and unavoidable) additive cost of $\tilde O(\sqrt{K \sum_{j=1}^M \sigma_j^2})$ for identifying the optimal (minimum-variance) source. Our theoretical bounds represent a significant improvement over natural baselines, e.g., Uniform UCB or Explore-then-Commit UCB, which can suffer regret scaling with $\sigma_{\max}^2$ in place of ${\sigma^*}^2$, a gap that can be arbitrarily large when $\sigma_{\max} \gg \sigma^*$. Experiments on multiple synthetic problem instances and the real-world MovieLens 25M dataset demonstrate the superior performance of SOAR over these baselines.
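The abstract does not spell out SOAR's implementation, but the problem setting (each round, pick an arm *and* a source; sources share arm means but differ in noise variance) can be illustrated with a minimal simulation. The sketch below is **not** the paper's algorithm: it uses a crude stand-in for SOAR's variance-based source pruning (query each source a few times, then commit to the empirically lowest-variance one) combined with a standard variance-aware UCB over arms. All problem parameters (`K`, `M`, `T`, the arm means, the source standard deviations) are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem instance: K arms, M sources, horizon T.
K, M, T = 5, 3, 20000
means = np.array([0.9, 0.7, 0.5, 0.3, 0.1])   # arm means; arm 0 is optimal
sigmas = np.array([0.2, 1.0, 3.0])            # source noise std devs; source 0 is best

counts = np.zeros(K)      # pulls per arm
sums = np.zeros(K)        # reward sums per arm
src_counts = np.zeros(M)  # queries per source
src_mean = np.zeros(M)    # running mean of noise residuals (Welford)
src_m2 = np.zeros(M)      # running sum of squared deviations (Welford)
regret = 0.0

for t in range(1, T + 1):
    # Source choice: round-robin exploration, then commit to the source with
    # the smallest empirical variance (crude stand-in for variance pruning).
    if t <= 10 * M:
        j = (t - 1) % M
    else:
        j = int(np.argmin(src_m2 / np.maximum(src_counts - 1, 1)))
    # Arm choice: pull each arm once, then variance-aware UCB using the
    # chosen source's empirical noise variance in the confidence width.
    if t <= K:
        i = t - 1
    else:
        var_j = src_m2[j] / max(src_counts[j] - 1, 1)
        ucb = sums / counts + np.sqrt(2 * var_j * np.log(t) / counts)
        i = int(np.argmax(ucb))
    r = means[i] + sigmas[j] * rng.standard_normal()
    counts[i] += 1
    sums[i] += r
    # Welford update of the source's noise variance from reward residuals
    # (uses the true arm mean purely to keep this illustration short).
    src_counts[j] += 1
    d = (r - means[i]) - src_mean[j]
    src_mean[j] += d / src_counts[j]
    src_m2[j] += d * ((r - means[i]) - src_mean[j])
    regret += means.max() - means[i]

print(f"cumulative pseudo-regret after {T} rounds: {regret:.1f}")
```

Once the low-variance source is identified, the UCB confidence widths shrink with ${\sigma^*}^2$ rather than $\sigma_{\max}^2$, which is the gap between SOAR and the Uniform UCB baseline described in the abstract. SOAR itself interleaves source and arm identification rather than committing after a fixed exploration phase.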