🤖 AI Summary
This paper addresses the bias induced by nonlinear inverse link functions in generalized low-rank trace regression. We propose a two-stage estimator, GL-LowPopArt: the first stage obtains an initial estimate via nuclear norm regularization; the second stage performs bias correction using a matrix Catoni estimator. Our method achieves, for the first time, nearly instance-optimal local minimax error bounds. We introduce a novel experimental design criterion, GL(π), and extend it to the bilinear dueling bandit setting, deriving a problem-dependent Borda regret bound. Theoretically, our Frobenius norm error bound strictly improves upon those of Fan et al. (2019) and Kang et al. (2022). Empirically, GL-LowPopArt significantly outperforms vectorized baselines on generalized matrix completion and bilinear dueling tasks, demonstrating both strong theoretical guarantees and practical efficacy.
📝 Abstract
We present `GL-LowPopArt`, a novel Catoni-style estimator for generalized low-rank trace regression. Building on `LowPopArt` (Jang et al., 2024), it employs a two-stage approach: nuclear norm regularization followed by matrix Catoni estimation. We establish state-of-the-art estimation error bounds, surpassing existing guarantees (Fan et al., 2019; Kang et al., 2022), and reveal a novel experimental design objective, $\mathrm{GL}(\pi)$. The key technical challenge is controlling the bias induced by the nonlinear inverse link function, which we address with our two-stage approach. We prove a *local* minimax lower bound, showing that `GL-LowPopArt` enjoys instance-wise optimality up to the condition number of the ground-truth Hessian. Applications include generalized linear matrix completion, where `GL-LowPopArt` achieves a state-of-the-art Frobenius error guarantee, and **bilinear dueling bandits**, a novel setting inspired by general preference learning (Zhang et al., 2024). Our analysis of a `GL-LowPopArt`-based explore-then-commit algorithm reveals a new, potentially interesting problem-dependent quantity, along with a Borda regret bound that improves upon the vectorization-based approach (Wu et al., 2024).
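To make the first stage concrete, the sketch below implements nuclear-norm-regularized trace regression with a logistic inverse link, solved by proximal gradient descent with singular value soft-thresholding. This is a hedged illustration only: the function names (`svt`, `stage1_nuclear_glm`), step size, and stopping rule are hypothetical simplifications, not the paper's actual algorithm, and the second-stage matrix Catoni correction is omitted.

```python
import numpy as np


def svt(M, tau):
    """Singular value soft-thresholding: the proximal operator of
    tau * (nuclear norm). Shrinks each singular value toward zero."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt


def stage1_nuclear_glm(X, y, shape, lam=0.1, step=0.1, iters=300):
    """Stage-1-style estimate (illustrative): minimize the logistic
    negative log-likelihood of <X_i, Theta> plus lam * ||Theta||_*,
    via proximal gradient descent."""
    Theta = np.zeros(shape)
    n = len(y)
    for _ in range(iters):
        z = np.array([np.sum(Xi * Theta) for Xi in X])  # <X_i, Theta>
        mu = 1.0 / (1.0 + np.exp(-z))                   # inverse link
        grad = sum((mi - yi) * Xi for mi, yi, Xi in zip(mu, y, X)) / n
        Theta = svt(Theta - step * grad, step * lam)    # gradient + prox
    return Theta
```

In a typical use, the measurement matrices `X` are random designs and `y` are Bernoulli responses with success probability given by the inverse link applied to the true trace inner product; the nuclear norm penalty encourages the recovered `Theta` to be low-rank.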