Linear Bandits with Non-i.i.d. Noise

📅 2025-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies the linear stochastic bandit problem under non-i.i.d. noise, relaxing the standard i.i.d. assumption to allow temporally dependent sub-Gaussian noise whose dependence decays over time (e.g., geometric mixing). To address this setting, we construct new confidence sequences via a reduction to sequential probability assignment, yielding an optimism-based policy whose exploration bonus explicitly accounts for the strength of dependence. The analysis combines sub-Gaussian modeling of the noise process, the probability-assignment reduction, and mixing-time arguments. Theoretically, we derive a regret upper bound parameterized by the dependence decay rate: under geometric mixing, the regret matches the i.i.d. rate up to a factor of the mixing time. This result extends the applicability and theoretical grounding of linear bandit algorithms to weakly dependent environments.

📝 Abstract
We study the linear stochastic bandit problem, relaxing the standard i.i.d. assumption on the observation noise. As an alternative to this restrictive assumption, we allow the noise terms across rounds to be sub-Gaussian but interdependent, with dependencies that decay over time. To address this setting, we develop new confidence sequences using a recently introduced reduction scheme to sequential probability assignment, and use these to derive a bandit algorithm based on the principle of optimism in the face of uncertainty. We provide regret bounds for the resulting algorithm, expressed in terms of the decay rate of the strength of dependence between observations. Among other results, we show that our bounds recover the standard rates up to a factor of the mixing time for geometrically mixing observation noise.
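To make the noise model concrete: a minimal sketch of "interdependent noise with geometrically decaying dependence" is a stationary AR(1) process, whose lag-k correlation shrinks like rho^k. This is an illustrative example of a geometrically mixing process, not the paper's general noise class; `rho` and `sigma` are assumed parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1_noise(T, rho=0.8, sigma=1.0):
    """Stationary AR(1) noise: eta_t = rho * eta_{t-1} + sigma * sqrt(1 - rho^2) * z_t.

    The stationary variance is sigma^2, and the correlation between eta_s
    and eta_t decays geometrically as rho^{|t - s|} -- one simple instance
    of geometric mixing.
    """
    z = rng.standard_normal(T)
    eta = np.zeros(T)
    eta[0] = sigma * z[0]
    scale = sigma * np.sqrt(1.0 - rho**2)
    for t in range(1, T):
        eta[t] = rho * eta[t - 1] + scale * z[t]
    return eta

noise = ar1_noise(10_000)
```

With a long sample, the empirical lag-1 autocorrelation of `noise` sits near `rho`, which is the kind of time-decaying dependence the paper's bounds are parameterized by.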
Problem

Research questions and friction points this paper is trying to address.

Relaxing i.i.d. noise assumption in linear bandits
Handling sub-Gaussian noise with time-decaying interdependence
Providing regret bounds for geometrically mixing noise
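A hedged reading of the geometric-mixing claim above (exact constants are not stated in this summary): the bound has the shape of the standard linear-bandit rate scaled by the mixing time.

```latex
% Illustrative shape only; the paper's precise bound may differ.
% tau_mix: mixing time of the noise process; d: dimension; T: horizon.
R(T) \;\le\; C \,\tau_{\mathrm{mix}} \cdot d\sqrt{T}\,\mathrm{polylog}(T)
```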
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sub-Gaussian interdependent noise modeling
New confidence sequences via sequential probability assignment
Optimism-based algorithm with regret bounds
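The optimism-based policy can be pictured with a LinUCB-style sketch in which the confidence radius is widened by a dependence factor. This is a hypothetical stand-in: the paper's actual confidence sequences come from the sequential-probability-assignment reduction, and `tau`, `beta0`, and `lam` here are illustrative parameters, not quantities from the paper.

```python
import numpy as np

def ofu_linear_bandit(arms, theta_star, T, noise, lam=1.0, beta0=1.0, tau=1.0):
    """LinUCB-style optimism: play argmax <theta_hat, a> + beta * ||a||_{V^{-1}},
    with the exploration bonus inflated by a dependence factor tau
    (a hypothetical proxy for a dependence-aware confidence sequence)."""
    d = arms.shape[1]
    V = lam * np.eye(d)          # regularized design matrix
    b = np.zeros(d)              # running sum of r_t * a_t
    opt = (arms @ theta_star).max()
    regret = 0.0
    for t in range(T):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b
        beta = beta0 * np.sqrt(tau)  # stronger dependence -> wider radius
        widths = np.sqrt(np.einsum("id,dk,ik->i", arms, V_inv, arms))
        a = arms[np.argmax(arms @ theta_hat + beta * widths)]
        r = a @ theta_star + noise[t]
        V += np.outer(a, a)
        b += r * a
        regret += opt - a @ theta_star
    return regret

# Toy run: three arms in R^2, mildly noisy rewards.
rng = np.random.default_rng(1)
arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
theta_star = np.array([1.0, 0.2])
reg = ofu_linear_bandit(arms, theta_star, T=500,
                        noise=0.1 * rng.standard_normal(500), tau=4.0)
```

The design point this illustrates: the only place dependence enters is the bonus width, matching the summary's claim that the exploration bonus "explicitly accounts for the strength of dependence."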