🤖 AI Summary
This paper addresses high-dimensional matrix completion under nonignorable missingness mechanisms, a setting that few existing works consider. It proposes a unified framework built on a row- and column-wise decoupled matrix U-statistic pseudo-likelihood loss function, regularized by the nuclear norm. Optimization is performed via a singular value soft-thresholding (proximal) gradient algorithm. The authors establish a non-asymptotic upper bound on the estimation error in the Frobenius norm. Experiments on synthetic and real-world datasets illustrate the method's imputation accuracy relative to existing approaches that assume ignorable missingness.
📝 Abstract
In this study, we establish a unified framework for the high-dimensional matrix completion problem under flexible nonignorable missing mechanisms. Although matrix completion has attracted much attention over the years, very few works consider nonignorable missing mechanisms. To address this problem, we derive a row- and column-wise matrix U-statistic-type loss function, with the nuclear norm for regularization. A singular value proximal gradient algorithm is developed to solve the resulting optimization problem. We prove a non-asymptotic upper bound on the Frobenius norm of the estimation error and demonstrate the performance of our method through numerical simulations and real data analysis.
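For readers unfamiliar with singular value proximal gradient methods, the following is a minimal Python sketch of the generic iteration for a nuclear-norm-regularized objective. It is not the authors' implementation: the abstract does not specify the U-statistic loss, so a squared-error loss on observed entries stands in as a placeholder, and the names `svt`, `proximal_gradient`, `objective`, and the values of `lam` and `step` are illustrative assumptions.

```python
import numpy as np


def svt(Z, tau):
    """Singular value soft-thresholding: the proximal map of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink each singular value toward zero
    return (U * s_shrunk) @ Vt


def proximal_gradient(grad, shape, lam, step, n_iter=200):
    """Iterate M <- svt(M - step * grad(M), step * lam), starting from zero."""
    M = np.zeros(shape)
    for _ in range(n_iter):
        M = svt(M - step * grad(M), step * lam)
    return M


# Synthetic rank-2 matrix with roughly 60% of entries observed (assumed setup).
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(truth.shape) < 0.6
Y = np.where(mask, truth, 0.0)


def grad(M):
    # Gradient of the PLACEHOLDER loss 0.5 * ||mask * (M - Y)||_F^2.
    # The paper's U-statistic pseudo-likelihood gradient would go here instead.
    return mask * (M - Y)


def objective(M, lam):
    # Placeholder smooth loss plus the nuclear norm penalty.
    return 0.5 * np.sum((mask * (M - Y)) ** 2) + lam * np.linalg.norm(M, ord="nuc")


lam = 0.5
M_hat = proximal_gradient(grad, truth.shape, lam=lam, step=1.0)
rel_err = np.linalg.norm(M_hat - truth) / np.linalg.norm(truth)
print(f"relative Frobenius error: {rel_err:.3f}")
```

The step size of 1.0 matches the Lipschitz constant of the placeholder gradient, which is what makes each iteration a descent step here; a different loss would need its own step size or a line search.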