Efficient Over-parameterized Matrix Sensing from Noisy Measurements via Alternating Preconditioned Gradient Descent

📅 2025-02-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses robust sensing of matrices of unknown rank from noisy measurements. It tackles two key challenges: the true rank $r^\star$ is unknown, and the signal-to-noise ratio is low. To this end, the authors propose an over-parameterized factorization model $LR^\top$ with $r > r^\star$, coupled with an Alternating Preconditioned Gradient Descent (APGD) algorithm. Unlike conventional preconditioned or gradient-based methods, APGD eliminates the need for a damping parameter $\lambda$, accommodates arbitrary random initialization, permits larger step sizes, and achieves linear convergence in the nonconvex setting. Theoretically, APGD attains a near-optimal estimation error bound. Compared with existing preconditioned and gradient descent approaches, APGD converges faster and is significantly more robust to initialization. Extensive experiments validate its superior empirical performance across diverse noise regimes.

📝 Abstract
We consider the noisy matrix sensing problem in the over-parameterization setting, where the estimated rank $r$ is larger than the true rank $r_\star$. Specifically, our main objective is to recover a matrix $X_\star \in \mathbb{R}^{n_1 \times n_2}$ with rank $r_\star$ from noisy measurements using an over-parameterized factorized form $LR^\top$, where $L \in \mathbb{R}^{n_1 \times r}$, $R \in \mathbb{R}^{n_2 \times r}$, and $\min\{n_1, n_2\} \ge r > r_\star$, with the true rank $r_\star$ being unknown. Recently, preconditioning methods have been proposed to accelerate the convergence of the matrix sensing problem compared to vanilla gradient descent, incorporating preconditioning terms $(L^\top L + \lambda I)^{-1}$ and $(R^\top R + \lambda I)^{-1}$ into the original gradient. However, these methods require careful tuning of the damping parameter $\lambda$ and are sensitive to initial points and step size. To address these limitations, we propose the alternating preconditioned gradient descent (APGD) algorithm, which alternately updates the two factor matrices, eliminating the need for the damping parameter and enabling faster convergence with larger step sizes. We theoretically prove that APGD achieves near-optimal error convergence at a linear rate, starting from arbitrary random initializations. Through extensive experiments, we validate our theoretical results and demonstrate that APGD outperforms other methods, achieving the fastest convergence rate. Notably, both our theoretical analysis and experimental results illustrate that APGD does not rely on the initialization procedure, making it more practical and versatile.
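The alternating, damping-free update described in the abstract can be sketched as follows. This is a minimal illustration assuming Gaussian measurement matrices $A_i$ and the least-squares loss $f(L,R)=\frac{1}{2m}\sum_i(\langle A_i, LR^\top\rangle - y_i)^2$; the function name `apgd`, the step size, the initialization scale, and the use of `pinv` (in place of an exact inverse, for numerical safety) are implementation assumptions, not the paper's exact algorithm.

```python
import numpy as np

def apgd(y, A, n1, n2, r, eta=0.4, iters=400, seed=0):
    """Hypothetical sketch of Alternating Preconditioned Gradient Descent
    for noisy matrix sensing with over-parameterized factors L @ R.T."""
    rng = np.random.default_rng(seed)
    # Arbitrary random initialization: no spectral or damped warm start needed.
    L = rng.standard_normal((n1, r)) / np.sqrt(r)
    R = rng.standard_normal((n2, r)) / np.sqrt(r)
    m = len(y)
    for _ in range(iters):
        # Residuals of the linear measurements: <A_i, L R^T> - y_i.
        res = np.array([np.sum(Ai * (L @ R.T)) for Ai in A]) - y
        # L-step: gradient right-multiplied by (R^T R)^{-1}, no damping term.
        grad_L = sum(ri * Ai for ri, Ai in zip(res, A)) @ R / m
        L = L - eta * grad_L @ np.linalg.pinv(R.T @ R)
        # Alternation: recompute residuals with the updated L before the R-step.
        res = np.array([np.sum(Ai * (L @ R.T)) for Ai in A]) - y
        grad_R = sum(ri * Ai.T for ri, Ai in zip(res, A)) @ L / m
        R = R - eta * grad_R @ np.linalg.pinv(L.T @ L)
    return L, R
```

Because each factor's gradient is rescaled by the Gram matrix of the other factor, the update is insensitive to the conditioning of the current iterate, which is what permits the larger step sizes and arbitrary random initialization claimed above.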
Problem

Research questions and friction points this paper is trying to address.

Matrix Recovery
Noisy Measurements
Unknown Rank
Innovation

Methods, ideas, or system contributions that make the work stand out.

APGD Algorithm
Matrix Recovery
Noise Resilience
Zhiyu Liu
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, P.R. China, and also with the University of Chinese Academy of Sciences, Beijing 100049, China
Zhi Han
SIA, CAS
Computer Vision
Yandong Tang
Professor, Shenyang Institute of Automation, Chinese Academy of Sciences
Computer vision, image processing, pattern recognition
Hai Zhang
Department of Statistics, Northwest University, Xi’an 710000, P.R. China
Shaojie Tang
University at Buffalo
Optimization, Machine Learning
Yao Wang
Center for Intelligent Decision-making and Machine Learning, School of Management, Xi’an Jiaotong University, Xi’an 710049, P.R. China