🤖 AI Summary
This work addresses the robust recovery of low-rank matrices from noisy linear measurements when the true rank is unknown. In this setting, gradient descent on a rank-overspecified matrix factorization overfits if run to convergence, because the global minimizers of the objective do not correspond to the ground truth. We show that, under a restricted isometry property (RIP) whose rank parameter scales with the true rank rather than the overspecified rank, gradient descent from a small random initialization follows a trajectory toward the ground truth and achieves nearly information-theoretically optimal recovery when stopped appropriately. We then propose an early-stopping strategy based on hold-out validation and prove that it detects a nearly optimal estimator, which removes the need to know the true rank in advance. Experiments indicate that the same validation approach can also be used efficiently for image restoration with deep image prior, where a deep network over-parameterizes the image.
📝 Abstract
This paper studies the problem of recovering a low-rank matrix from several noisy random linear measurements. We consider the setting where the rank of the ground-truth matrix is unknown a priori and use an objective function built from a rank-overspecified factored representation of the matrix variable, where the global optimal solutions overfit and do not correspond to the underlying ground truth. We then solve the associated nonconvex problem using gradient descent with small random initialization. We show that as long as the measurement operators satisfy the restricted isometry property (RIP) with a rank parameter that scales with the rank of the ground-truth matrix rather than with the overspecified matrix rank, the gradient descent iterates follow a particular trajectory towards the ground-truth matrix and achieve nearly information-theoretically optimal recovery when the iteration is stopped appropriately. We then propose an efficient stopping strategy based on the common hold-out method and show that it provably detects a nearly optimal estimator. Moreover, experiments show that the proposed validation approach can also be used efficiently for image restoration with deep image prior, which over-parameterizes an image with a deep network.
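The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a symmetric positive semidefinite ground truth, symmetrized Gaussian measurement matrices, and illustrative values for the dimensions, noise level, initialization scale, step size, and iteration count, none of which come from the paper. The measurements are split into a training set that drives gradient descent on the overspecified factor `U` and a held-out set whose loss selects the stopping iterate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) problem sizes: true rank r, overspecified rank k > r.
n, r, k = 10, 2, 10
m_train, m_val = 240, 60   # hold-out split of the measurements
sigma = 0.1                # noise standard deviation
alpha = 1e-3               # scale of the small random initialization
eta = 0.05                 # gradient descent step size
T = 300                    # number of iterations

# Ground truth X_star = Q Q^T with orthonormal Q (rank r, PSD).
Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
X_star = Q @ Q.T

def sample(m):
    """Draw m symmetrized Gaussian measurement matrices and noisy measurements."""
    G = rng.standard_normal((m, n, n))
    A = (G + G.transpose(0, 2, 1)) / 2
    y = np.einsum("mij,ij->m", A, X_star) + sigma * rng.standard_normal(m)
    return A, y

A_tr, y_tr = sample(m_train)
A_va, y_va = sample(m_val)

def residual(A, y, U):
    """Measurement residuals <A_i, U U^T> - y_i."""
    return np.einsum("mij,ij->m", A, U @ U.T) - y

# Small random initialization of the overspecified factor.
U = alpha * rng.standard_normal((n, k))

val_loss, rel_err = [], []
best_val, best_t, best_U = np.inf, 0, U.copy()
for t in range(T):
    # Monitor the hold-out loss and keep the iterate that minimizes it.
    v = 0.5 * np.mean(residual(A_va, y_va, U) ** 2)
    val_loss.append(v)
    rel_err.append(np.linalg.norm(U @ U.T - X_star) / np.linalg.norm(X_star))
    if v < best_val:
        best_val, best_t, best_U = v, t, U.copy()

    # Gradient of (1 / (2 m)) * sum_i (<A_i, U U^T> - y_i)^2 for symmetric A_i.
    res = residual(A_tr, y_tr, U)
    grad = (2.0 / m_train) * np.einsum("m,mij->ij", res, A_tr) @ U
    U = U - eta * grad

print(f"selected iteration: {best_t}, relative error: {rel_err[best_t]:.3f}")
```

The true error `rel_err` is tracked only for inspection, since it requires `X_star`. The stopping decision uses the validation loss alone, which is what makes the hold-out rule usable when the ground truth and its rank are unknown.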