Spectral Thompson sampling

📅 2026-04-15

🤖 AI Summary
This work addresses the multi-armed bandit problem on graph-structured arms, where neighboring nodes have similar expected rewards, a setting in which conventional approaches scale poorly as the number of arms grows. The authors describe and analyze SpectralTS, an algorithm that combines Thompson sampling with spectral graph analysis, using a spectral representation derived from the graph Laplacian to enable efficient Bayesian inference. Using the effective dimension $d$ of the graph to characterize problem complexity, they show that SpectralTS achieves a regret of $O(d\sqrt{T \ln N})$ with high probability, where $T$ is the time horizon and $N$ is the number of arms. This is comparable to known results while being computationally more efficient. Empirical evaluations on both synthetic and real-world datasets show that SpectralTS is competitive.
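The summary does not say how the effective dimension $d$ is computed. One definition used in the spectral bandits literature takes $d$ to be the largest integer such that $(d-1)\lambda_d \le T / \ln(1 + T/\lambda)$, where $\lambda_1 \le \dots \le \lambda_N$ are the Laplacian eigenvalues and $\lambda$ is a regularization parameter. The sketch below assumes that definition; the function name `effective_dimension` and the parameter names are hypothetical.

```python
import math
import numpy as np


def effective_dimension(eigvals, horizon, lam):
    """Largest d with (d - 1) * lambda_d <= T / ln(1 + T / lam).

    Assumed definition (not stated on this page). `eigvals` are the graph
    Laplacian eigenvalues, `horizon` is T, `lam` is the regularizer.
    """
    eigvals = np.sort(np.asarray(eigvals, dtype=float))  # ascending order
    budget = horizon / math.log(1.0 + horizon / lam)
    d = 1  # k = 1 always qualifies because (1 - 1) * lambda_1 = 0
    for k, lam_k in enumerate(eigvals, start=1):
        if (k - 1) * lam_k <= budget:
            d = k
    return d


# Toy spectrum with rapidly growing eigenvalues, so d is much smaller than N.
d_toy = effective_dimension([0.0, 1.0, 10.0, 100.0, 1000.0], horizon=100, lam=1.0)
```

Under this definition, $d$ stays small whenever the eigenvalues grow quickly, which is the regime where a bound in $d$ rather than $N$ is useful.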

📝 Abstract
Thompson Sampling (TS) has attracted a lot of interest due to its good empirical performance, particularly in computational advertising. Despite this success, tools for analyzing its performance appeared only recently. In this paper, we describe and analyze the SpectralTS algorithm for a bandit problem in which the payoffs of the choices are smooth with respect to an underlying graph. In this setting, each choice is a node of a graph, and the expected payoffs of neighboring nodes are assumed to be similar. Although the setting has applications in both recommender systems and advertising, traditional algorithms scale poorly with the number of choices. To address this, we consider an effective dimension $d$, which is small in real-world graphs. We provide an analysis showing that the regret of SpectralTS scales as $d\sqrt{T \ln N}$ with high probability, where $T$ is the time horizon and $N$ is the number of choices. Since a $d\sqrt{T \ln N}$ regret is comparable to the known results, SpectralTS offers a computationally more efficient alternative. We also show that our algorithm is competitive on both synthetic and real-world data.
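The abstract describes the algorithm only at a high level. As a rough illustration, a Thompson sampling loop in the eigenbasis of the graph Laplacian might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `spectral_ts` and `path_graph_laplacian`, the regularizer `lam`, the Gaussian reward noise, and the toy path graph are all illustrative, and the exploration scale `v` is a free parameter here rather than the confidence width used in the paper's analysis.

```python
import numpy as np


def path_graph_laplacian(n):
    """Unnormalized Laplacian L = D - W of a path graph on n nodes (toy example)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return np.diag(W.sum(axis=1)) - W


def spectral_ts(laplacian, mean_reward, horizon, lam=0.1, v=0.5, noise=0.1, seed=0):
    """Thompson sampling over graph nodes using Laplacian eigenvectors as features."""
    rng = np.random.default_rng(seed)
    eigvals, Q = np.linalg.eigh(laplacian)  # L = Q diag(eigvals) Q^T
    n = laplacian.shape[0]
    B = np.diag(eigvals + lam)  # prior precision penalizes non-smooth directions
    f = np.zeros(n)             # running sum of reward-weighted features
    mu_hat = np.zeros(n)        # posterior mean of the spectral coefficients
    best = mean_reward.max()
    regret = 0.0
    pulls = np.zeros(n, dtype=int)
    for _ in range(horizon):
        # Draw mu_tilde ~ N(mu_hat, v^2 B^{-1}) via the Cholesky factor of B.
        chol = np.linalg.cholesky(B)
        z = rng.standard_normal(n)
        mu_tilde = mu_hat + v * np.linalg.solve(chol.T, z)
        # Row a of Q is the feature vector of node a; play the best sampled node.
        arm = int(np.argmax(Q @ mu_tilde))
        reward = mean_reward[arm] + noise * rng.standard_normal()
        x = Q[arm]
        B += np.outer(x, x)
        f += reward * x
        mu_hat = np.linalg.solve(B, f)
        regret += best - mean_reward[arm]
        pulls[arm] += 1
    return regret, pulls


# Toy run: expected rewards vary smoothly along a 30-node path graph.
n_nodes, n_rounds = 30, 200
means = np.cos(np.linspace(0.0, np.pi, n_nodes))
regret, pulls = spectral_ts(path_graph_laplacian(n_nodes), means, n_rounds)
```

The prior precision `diag(eigvals + lam)` is what encodes the smoothness assumption: coefficients on high-frequency eigenvectors are shrunk toward zero, so a single observation at one node also informs the estimates at its neighbors.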
Problem

Research questions and friction points this paper is trying to address.

bandit problem
graph structure
smooth payoffs
scalability
Thompson sampling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral Thompson Sampling
graph bandits
effective dimension
regret analysis
smooth payoffs