🤖 AI Summary
This paper studies the online decision-making problem of high-dimensional sparse linear bandits under extreme data scarcity (where $T \ll d$ and $T \ll K$) and a *blocking constraint*—each arm can be pulled at most once—motivated by applications such as personalized recommendation and efficient active learning. We propose BSLB, an online algorithm leveraging sparsity-aware exploration, and C-BSLB, a meta-algorithm that eliminates the need to know the true sparsity level $k$. To our knowledge, this is the first work integrating the corralling framework with blocking constraints. We further establish robust offline statistical guarantees for the Lasso estimator under mild sparse eigenvalue conditions. Theoretically, our approach achieves a regret bound of $\widetilde{O}\big((1+\eta_k)^2 k^{2/3} T^{2/3}\big)$, substantially improving upon prior results. Extensive experiments on multiple real-world datasets validate both the efficacy and practicality of the proposed methods.
📝 Abstract
We investigate the high-dimensional sparse linear bandits problem in a data-poor regime where the time horizon is much smaller than the ambient dimension and the number of arms. We study the setting under the additional blocking constraint where each unique arm can be pulled only once. The blocking constraint is motivated by practical applications in personalized content recommendation and in identifying data points to improve annotation efficiency for complex learning tasks. With mild assumptions on the arms, our proposed online algorithm (BSLB) achieves a regret guarantee of $\widetilde{\mathsf{O}}((1+\eta_k)^2 k^{\frac{2}{3}} \mathsf{T}^{\frac{2}{3}})$, where the parameter vector has an (unknown) relative tail $\eta_k$ -- the ratio of the $\ell_1$ norms of the top-$k$ entries and the remaining entries of the parameter vector. To this end, we show novel offline statistical guarantees of the Lasso estimator for the linear model that are robust to the sparsity modeling assumption. Finally, we propose a meta-algorithm (C-BSLB) based on corralling that does not need knowledge of the optimal sparsity parameter $k$, at minimal cost to regret. Our experiments on multiple real-world datasets demonstrate the validity of our algorithms and theoretical framework.
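To make the setting concrete, here is a minimal explore-then-commit sketch in the spirit of the abstract: during exploration, distinct arms are pulled (respecting the blocking constraint), a sparse parameter estimate is fit with a Lasso solver, and the remaining budget is spent on the highest-scoring unused arms. This is an illustrative toy, not the paper's BSLB algorithm; the function names, the ISTA-based Lasso solver, and the exploration-budget split are all assumptions made for this sketch.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iters=500):
    """Illustrative Lasso via ISTA: proximal gradient descent with soft-thresholding."""
    n, d = X.shape
    theta = np.zeros(d)
    # Step size from the Lipschitz constant of the least-squares gradient.
    L = np.linalg.norm(X, 2) ** 2 / n + 1e-12
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y) / n
        z = theta - grad / L
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return theta

def explore_then_commit(arms, pull, T, n_explore, lam=0.05, seed=None):
    """Toy sketch: explore distinct arms, fit a sparse estimate, commit to the
    best remaining arms -- each arm index is used at most once (blocking)."""
    rng = np.random.default_rng(seed)
    K, d = arms.shape
    # Exploration: sample without replacement so no arm is pulled twice.
    explore_idx = rng.choice(K, size=n_explore, replace=False)
    rewards = np.array([pull(i) for i in explore_idx])
    theta_hat = lasso_ista(arms[explore_idx], rewards, lam)
    # Commit: rank the unused arms by estimated reward and pull each once.
    remaining = np.setdiff1d(np.arange(K), explore_idx)
    scores = arms[remaining] @ theta_hat
    commit_idx = remaining[np.argsort(scores)[::-1][: T - n_explore]]
    return theta_hat, np.concatenate([explore_idx, commit_idx])
```

Note that the sketch operates in the regime the abstract describes: with `n_explore` much smaller than `d`, ordinary least squares is underdetermined, and the $\ell_1$ penalty is what makes estimation feasible.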