AI Summary
This paper addresses tensor completion under nonignorable missingness (MNAR), where missingness probabilities depend on unobserved tensor entries and observations are noisy, a setting that previously lacked a unified statistical framework. We propose the first general inferential framework applicable to continuous, binary, and count-valued tensors. Our method jointly models the underlying low-rank tensor structure and the missingness mechanism via a generalized linear model for the selection probabilities, and estimates parameters with an alternating maximization algorithm. We establish the first non-asymptotic, iteration-wise error bounds for tensor estimation under MNAR and derive theoretical conditions under which the MNAR mechanism is testable. Experiments on synthetic data and two real-world datasets show that our approach significantly outperforms conventional tensor completion methods that assume ignorable missingness (MAR/MCAR), achieving both statistical rigor and practical robustness.
Abstract
Tensor completion plays a crucial role in applications such as recommender systems and medical imaging, where data are often highly incomplete. While extensive prior work has addressed tensor completion with missing data, most approaches assume that each entry of the tensor is observed independently with probability $p$. However, real-world tensor data often exhibit missing-not-at-random (MNAR) patterns, where the probability of missingness depends on the underlying tensor values. This paper introduces a generalized tensor completion framework for noisy data under MNAR, in which the observation probability is modeled as a function of the underlying tensor values. Our flexible framework accommodates various tensor data types, such as continuous, binary, and count data. For model estimation, we develop an alternating maximization algorithm and derive non-asymptotic error bounds for the estimator at each iteration, under considerably relaxed conditions on the observation probabilities. Additionally, we propose a statistical inference procedure to test whether observation probabilities depend on the underlying tensor values, offering a formal assessment of the missingness assumption within our modeling framework. The utility and efficacy of our approach are demonstrated through comparative simulation studies and analyses of two real-world datasets.
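To make the setting concrete, here is a minimal, hypothetical sketch of the ingredients described above: a low-rank (CP) tensor, a logistic (GLM-style) MNAR selection mechanism in which the observation probability depends on the entry value itself, an alternating least-squares fit of the factors on the observed entries, and a separate logistic fit of the selection parameters. All sizes, ranks, and parameter values are illustrative; the paper's actual estimator maximizes a joint likelihood over both components, whereas this toy fits them in two stages.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 12, 12, 12, 2  # illustrative dimensions and CP rank

# Ground-truth low-rank tensor in CP form.
U, V, W = (rng.normal(size=(n, R)) for n in (I, J, K))
Theta = np.einsum('ir,jr,kr->ijk', U, V, W)

# MNAR selection: each entry is observed with probability
# sigmoid(alpha + beta * Theta[i,j,k]), so missingness depends on the
# (unobserved) value itself. alpha, beta are assumed, not from the paper.
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
alpha, beta = 0.5, 1.0
mask = rng.random(Theta.shape) < sigmoid(alpha + beta * Theta)
Y = np.where(mask, Theta + 0.1 * rng.normal(size=Theta.shape), 0.0)

def solve_mode(Yu, Mu, KR, R):
    """Row-wise least squares on the observed entries of one unfolding."""
    out = np.zeros((Yu.shape[0], R))
    for i in range(Yu.shape[0]):
        obs = Mu[i]
        out[i], *_ = np.linalg.lstsq(KR[obs], Yu[i, obs], rcond=None)
    return out

# Alternating updates of the three factor matrices (masked ALS).
Uh, Vh, Wh = (rng.normal(size=(n, R)) for n in (I, J, K))
for _ in range(30):
    KR = np.einsum('jr,kr->jkr', Vh, Wh).reshape(J * K, R)
    Uh = solve_mode(Y.reshape(I, -1), mask.reshape(I, -1), KR, R)
    KR = np.einsum('ir,kr->ikr', Uh, Wh).reshape(I * K, R)
    Vh = solve_mode(Y.transpose(1, 0, 2).reshape(J, -1),
                    mask.transpose(1, 0, 2).reshape(J, -1), KR, R)
    KR = np.einsum('ir,jr->ijr', Uh, Vh).reshape(I * J, R)
    Wh = solve_mode(Y.transpose(2, 0, 1).reshape(K, -1),
                    mask.transpose(2, 0, 1).reshape(K, -1), KR, R)

Th = np.einsum('ir,jr,kr->ijk', Uh, Vh, Wh)

# With the tensor estimate fixed, fit (alpha, beta) by gradient ascent on
# the Bernoulli log-likelihood of the observation indicators. A test of
# beta = 0 is the kind of check the paper's inference procedure formalizes.
ah, bh = 0.0, 0.0
for _ in range(2000):
    d = mask - sigmoid(ah + bh * Th)
    ah += 0.5 * d.mean()
    bh += 0.5 * (d * Th).mean()

rel_err = np.linalg.norm(Th - Theta) / np.linalg.norm(Theta)
print(f"relative error: {rel_err:.3f}, estimated beta: {bh:.2f}")
```

An estimated `beta` far from zero signals value-dependent (nonignorable) missingness; under MAR/MCAR the selection probability would not vary with the entry values.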