Sharp analysis of linear ensemble sampling

📅 2026-02-08
🤖 AI Summary
This work addresses the long-standing gap in theoretical understanding of Ensemble Sampling (ES) for linear stochastic bandits, where tight high-probability regret bounds have remained elusive compared to Thompson Sampling. By modeling the exploration mechanism of discrete-time ES as a time-uniform boundary-crossing problem involving multiple independent Brownian motions, we leverage tools from stochastic process theory, high-dimensional probability, and time-uniform concentration inequalities to establish a sharp regret upper bound. We prove that with an ensemble size of Θ(d log n), ES achieves a high-probability regret bound of Õ(d^{3/2}√n) while maintaining computational complexity comparable to Thompson Sampling, thereby closing the theoretical gap between the two methods. Our analysis further highlights the naturalness and necessity of a continuous-time perspective in deriving such results.

📝 Abstract
We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m=\Theta(d\log n)$, ES attains $\tilde O(d^{3/2}\sqrt n)$ high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for $m$ independent Brownian motions. Intriguingly, this continuous-time lens is not forced; it appears natural--and perhaps necessary: the discrete-time problem seems to be asking for a continuous-time solution, and we know of no other way to obtain a sharp ES bound.
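
To make the object of study concrete, here is a minimal sketch of a generic linear ensemble sampling loop with standard Gaussian perturbations. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name `linear_ensemble_sampling`, the ridge parameter `lam`, the prior perturbation scale, the uniform choice of ensemble member, and the finite arm set are all assumptions made for this example. Only the ensemble size $m=\Theta(d\log n)$ and the use of standard Gaussian perturbations come from the abstract.

```python
import numpy as np


def linear_ensemble_sampling(arms, theta_star, n, m, lam=1.0, noise_sd=1.0, seed=0):
    """Run a generic linear ES loop and return the cumulative regret per round.

    arms: (K, d) array of candidate actions (finite arm set, assumed here).
    theta_star: (d,) true parameter, used only to simulate rewards and regret.
    """
    rng = np.random.default_rng(seed)
    d = arms.shape[1]
    V = lam * np.eye(d)          # regularised design matrix
    b = np.zeros(d)              # running sum of reward * action
    # One perturbation vector per ensemble member; the sqrt(lam)-scaled
    # Gaussian initialisation is an assumed prior perturbation.
    S = np.sqrt(lam) * rng.standard_normal((m, d))
    best = np.max(arms @ theta_star)
    regret = np.zeros(n)
    total = 0.0
    for t in range(n):
        j = rng.integers(m)                       # pick a member uniformly (assumed rule)
        theta_j = np.linalg.solve(V, b + S[j])    # perturbed ridge estimate
        x = arms[np.argmax(arms @ theta_j)]       # act greedily for that member
        reward = x @ theta_star + noise_sd * rng.standard_normal()
        V += np.outer(x, x)
        b += reward * x
        # Each member adds its own fresh standard Gaussian multiple of x.
        S += rng.standard_normal((m, 1)) * x
        total += best - x @ theta_star
        regret[t] = total
    return regret


# Example run with ensemble size proportional to d * log(n).
d, n = 5, 2000
demo_rng = np.random.default_rng(1)
arms = demo_rng.standard_normal((20, d))
arms /= np.linalg.norm(arms, axis=1, keepdims=True)
theta_star = np.ones(d) / np.sqrt(d)
m = int(np.ceil(d * np.log(n)))
regret = linear_ensemble_sampling(arms, theta_star, n, m)
```

Each round costs one linear solve plus an update of the $m$ perturbation vectors, which is the sense in which the per-round computation stays comparable to Thompson sampling when $m$ grows only like $d\log n$.
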
Problem

Research questions and friction points this paper is trying to address.

linear bandits
ensemble sampling
regret analysis
randomized exploration
stochastic optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

linear ensemble sampling
stochastic linear bandits
Brownian motion
time-uniform concentration
randomized exploration