Multi-Task Representation Learning for Conservative Linear Bandits

📅 2026-05-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the problem of learning a shared low-dimensional representation under safety constraints in multi-task linear bandits. The authors propose the Constrained Multi-Task Representation Learning (CMTRL) framework, which combines conservative bandit algorithms with multi-task low-rank representation learning. Central to the framework is the Safe-AltGDmin algorithm (Safe-Alternating projected Gradient Descent and minimization), which recovers a shared rank-r feature matrix while satisfying the given constraints. The theoretical analysis provides guarantees on regret and sample complexity, and experiments compare the method against benchmark algorithms.
📝 Abstract
This paper presents the Constrained Multi-Task Representation Learning (CMTRL) framework for linear bandits. We consider T linear bandit tasks in a d-dimensional space that share a common low-dimensional representation of dimension r, where r is much smaller than the minimum of d and T. Furthermore, the tasks are constrained so that only actions meeting specific safety or performance requirements are allowed; such problems are referred to as conservative (safe) bandits. We introduce a novel algorithm, Safe-Alternating projected Gradient Descent and minimization (Safe-AltGDmin), to recover a low-rank feature matrix while satisfying the given constraints. Building on this algorithm, we propose a multi-task representation learning framework for conservative linear bandits and establish theoretical guarantees on its regret and sample complexity. We present experiments and compare the performance of our algorithm with that of benchmark algorithms.
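To make the shared low-rank structure concrete, the following is a minimal sketch of plain (unconstrained) alternating gradient descent and minimization on synthetic data. It is not the paper's Safe-AltGDmin: the safety constraints are omitted, and the data model, the spectral initialization, the step-size heuristic, and all names (`altgdmin`, `min_step`, `subspace_distance`, `B_star`, `W_star`) are assumptions made for illustration. Each task parameter is modeled as theta_t = B w_t, with a shared d x r matrix B and a task-specific r-vector w_t.

```python
import numpy as np


def subspace_distance(B_hat, B_star):
    """Spectral norm of (I - B_hat B_hat^T) B_star; 0 means identical column spans."""
    d = B_star.shape[0]
    return np.linalg.norm((np.eye(d) - B_hat @ B_hat.T) @ B_star, 2)


def min_step(X, Y, B):
    """Minimization step: per-task least squares for w_t with B held fixed."""
    T = X.shape[0]
    W = np.zeros((B.shape[1], T))
    for t in range(T):
        W[:, t] = np.linalg.lstsq(X[t] @ B, Y[t], rcond=None)[0]
    return W


def altgdmin(X, Y, r, n_iters=100, step=0.5):
    """Unconstrained alternating GD-min sketch (illustrative, not Safe-AltGDmin).

    X has shape (T, m, d): m feature vectors per task.
    Y has shape (T, m): the corresponding linear responses.
    """
    T, m, d = X.shape
    # Spectral initialization: top-r left singular vectors of per-task moment estimates.
    Theta0 = np.stack([X[t].T @ Y[t] / m for t in range(T)], axis=1)
    U, _, _ = np.linalg.svd(Theta0, full_matrices=False)
    B = U[:, :r]
    for _ in range(n_iters):
        W = min_step(X, Y, B)
        # Gradient of 0.5 * sum_t ||X_t B w_t - y_t||^2 with respect to B.
        grad = np.zeros((d, r))
        for t in range(T):
            resid = X[t] @ (B @ W[:, t]) - Y[t]
            grad += np.outer(X[t].T @ resid, W[:, t])
        # Heuristic step size (an assumption), then QR to re-orthonormalize B.
        eta = step / (m * np.linalg.norm(W, 2) ** 2)
        B, _ = np.linalg.qr(B - eta * grad)
    return B, min_step(X, Y, B)


# Synthetic, noise-free instance with r much smaller than min(d, T).
rng = np.random.default_rng(0)
d, T, r, m = 20, 100, 2, 50
B_star, _ = np.linalg.qr(rng.standard_normal((d, r)))
W_star = rng.standard_normal((r, T))
X = rng.standard_normal((T, m, d))
Y = np.einsum("tmd,dt->tm", X, B_star @ W_star)

B_hat, W_hat = altgdmin(X, Y, r)
err = subspace_distance(B_hat, B_star)
print("subspace distance:", err)
```

The sketch alternates a cheap per-task least-squares solve with a single gradient step on the shared matrix, so each task only ever fits r coefficients rather than d. A conservative variant would additionally have to restrict which actions are played while the data are collected, which this example does not attempt.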
Problem

Research questions and friction points this paper is trying to address.

multi-task representation learning
conservative linear bandits
low-dimensional representation
safety constraints
linear bandits
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-task representation learning
conservative bandits
low-rank recovery
Safe-AltGDmin
linear bandits
🔎 Similar Papers