Generalized Low-Rank Matrix Contextual Bandits with Graph Information

📅 2025-07-23
🤖 AI Summary
Existing matrix contextual bandit approaches neglect graph-structured relationships among users and items, which limits how efficiently they learn policies. To address this, we propose the first matrix bandit framework that jointly incorporates low-rank structure and graph priors: it combines nuclear norm regularization with a graph Laplacian regularizer to model user/item similarities. We further design an algorithm based on a graph-based generalized linear UCB. Theoretically, our method achieves a better cumulative regret bound than several popular alternatives. Experiments on synthetic and real-world data further illustrate its merits. Our core contribution is the unified modeling of low-rankness and graph structure, which brings relational side information into sequential decision-making under uncertainty in a principled way.

📝 Abstract
The matrix contextual bandit (CB), an extension of the well-known multi-armed bandit, is a powerful framework that has been widely applied in sequential decision-making scenarios involving low-rank structure. In many real-world scenarios, such as online advertising and recommender systems, graph information is often available in addition to the low-rank structure; that is, similarity relationships among users/items can be naturally captured by the connectivity among nodes in the corresponding graphs. However, existing matrix CB methods fail to exploit such graph information, which makes it difficult for them to produce effective decision-making policies. To fill this void, we propose a novel matrix CB algorithmic framework built on the classical upper confidence bound (UCB) framework. The new framework integrates the low-rank structure and the graph information in a unified manner. Specifically, it first solves a joint nuclear norm and matrix Laplacian regularization problem and then runs a graph-based generalized linear version of the UCB algorithm. Rigorous theoretical analysis shows that our procedure outperforms several popular alternatives in terms of the cumulative regret bound, owing to its effective use of graph information. A series of experiments on synthetic and real-world data further illustrates the merits of our procedure.
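
As a rough illustration of the first step, a joint nuclear norm and Laplacian regularized estimator could take a form like the one below. This is a sketch under assumptions, not necessarily the paper's exact formulation: the loss $\ell$, the weights $\lambda_1, \lambda_2$, the context matrices $X_t$, the rewards $y_t$, and the user and item graph Laplacians $L_u, L_v$ are all notation introduced here for illustration.

```latex
\hat{\Theta} \in \arg\min_{\Theta \in \mathbb{R}^{d_1 \times d_2}}
  \frac{1}{n} \sum_{t=1}^{n} \ell\bigl(y_t, \langle X_t, \Theta \rangle\bigr)
  + \lambda_1 \lVert \Theta \rVert_*
  + \lambda_2 \bigl[ \operatorname{tr}(\Theta^\top L_u \Theta)
                   + \operatorname{tr}(\Theta L_v \Theta^\top) \bigr]
```

In this form, the nuclear norm term encourages a low-rank estimate, while the two trace terms penalize differences between rows (columns) of $\Theta$ whose users (items) are connected in the corresponding graph.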
Problem

Research questions and friction points this paper is trying to address.

Integrate graph information into matrix contextual bandits
Combine low-rank structure with graph-based relationships
Improve decision-making policies using Laplacian regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates low-rank structure with graph information
Uses nuclear norm and Laplacian regularization
Implements graph-based generalized linear UCB
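
The three points above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a squared loss (the identity-link special case of a generalized linear model), toy path graphs for users and items, a plain proximal gradient solver, and a generic LinUCB-style exploration bonus on vectorized arms. All function names, graphs, and hyperparameters are hypothetical.

```python
import numpy as np

def path_laplacian(n):
    """Unnormalized Laplacian L = D - A of a path graph on n nodes (toy graph)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(Theta, Xs, ys, L_u, L_v, lam_nuc, lam_graph):
    """Squared loss + nuclear norm + Laplacian smoothness on rows and columns."""
    preds = np.einsum("tij,ij->t", Xs, Theta)
    loss = 0.5 * np.mean((preds - ys) ** 2)
    nuc = np.linalg.norm(Theta, ord="nuc")
    graph = np.trace(Theta.T @ L_u @ Theta) + np.trace(Theta @ L_v @ Theta.T)
    return loss + lam_nuc * nuc + lam_graph * graph

def fit(Xs, ys, L_u, L_v, lam_nuc=0.01, lam_graph=0.01, step=0.1, iters=300):
    """Proximal gradient: gradient step on the smooth terms, then SVT."""
    Theta = np.zeros(Xs.shape[1:])
    for _ in range(iters):
        resid = np.einsum("tij,ij->t", Xs, Theta) - ys
        grad = np.einsum("t,tij->ij", resid, Xs) / len(ys)
        grad += 2.0 * lam_graph * (L_u @ Theta + Theta @ L_v)
        Theta = svt(Theta - step * grad, step * lam_nuc)
    return Theta

def ucb_scores(arms, Theta, V_inv, alpha=1.0):
    """Generic LinUCB-style score on vectorized arms: estimate + alpha * width."""
    flat = arms.reshape(len(arms), -1)
    mean = flat @ Theta.ravel()
    width = np.sqrt(np.einsum("ti,ij,tj->t", flat, V_inv, flat))
    return mean + alpha * width

# Synthetic low-rank problem (all sizes are arbitrary illustration values).
rng = np.random.default_rng(0)
d1, d2, rank, n = 6, 5, 2, 200
Theta_true = rng.normal(size=(d1, rank)) @ rng.normal(size=(rank, d2))
Xs = rng.normal(size=(n, d1, d2))
ys = np.einsum("tij,ij->t", Xs, Theta_true) + 0.1 * rng.normal(size=n)

L_u, L_v = path_laplacian(d1), path_laplacian(d2)
Theta_hat = fit(Xs, ys, L_u, L_v)

# Score ten candidate arms using a ridge-style design matrix.
flat = Xs.reshape(n, -1)
V_inv = np.linalg.inv(flat.T @ flat + np.eye(d1 * d2))
scores = ucb_scores(Xs[:10], Theta_hat, V_inv)
best_arm = int(np.argmax(scores))
```

The split between `fit` and `ucb_scores` mirrors the two-stage description in the abstract: estimate with both regularizers first, then select arms optimistically. How the paper constructs its graph-based confidence bound is not specified here, so the bonus above should be read only as a placeholder.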
Yao Wang
School of Management, Xi’an Jiaotong University, Xi’an, China
Jiannan Li
Assistant Professor, Singapore Management University
human-computer interaction, human-robot interaction
Yue Kang
Microsoft, Washington, United States
Shanxing Gao
School of Management, Xi’an Jiaotong University, Xi’an, China
Zhenxin Xiao
School of Management, Xi’an Jiaotong University, Xi’an, China