Optimal Transfer Learning for Missing Not-at-Random Matrix Completion

📅 2025-02-28
🤖 AI Summary
This paper addresses matrix completion for a target matrix $Q$ under a missing-not-at-random (MNAR) mechanism in which entire rows and columns are missing, a common scenario in biological applications such as single-cell multi-omics. Because $Q$ cannot be estimated without side information, the paper uses a noisy and incomplete source matrix $P$ whose latent factors are related to those of $Q$ through a feature shift. It proposes a transfer-learning-based spectral correction estimation framework and considers both active and passive sampling of rows and columns. First, it establishes minimax lower bounds on the entrywise estimation error in each sampling setting. Second, it proves that the proposed estimator attains the lower bound in the active setting, where the source data guide which rows and columns of $Q$ to query; this avoids the incoherence assumptions required for rate optimality under passive sampling. Comparisons with existing algorithms on real-world biological datasets demonstrate the effectiveness of the approach.

📝 Abstract
We study transfer learning for matrix completion in a Missing Not-at-Random (MNAR) setting that is motivated by biological problems. The target matrix $Q$ has entire rows and columns missing, making estimation impossible without side information. To address this, we use a noisy and incomplete source matrix $P$, which relates to $Q$ via a feature shift in latent space. We consider both the active and passive sampling of rows and columns. We establish minimax lower bounds for entrywise estimation error in each setting. Our computationally efficient estimation framework achieves this lower bound for the active setting, which leverages the source data to query the most informative rows and columns of $Q$. This avoids the need for incoherence assumptions required for rate optimality in the passive sampling setting. We demonstrate the effectiveness of our approach through comparisons with existing algorithms on real-world biological datasets.
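The setting described above can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the paper's estimator: the "feature shift" is modeled here as $P$ and $Q$ sharing row and column latent spaces while differing in an $r \times r$ core, $P$ is taken to be fully observed and noise-free (the abstract's $P$ is noisy and incomplete), and the "active" query rule is a simple leverage-score heuristic standing in for whatever selection rule the paper actually uses. All function names are hypothetical.

```python
import numpy as np


def make_source_and_target(m=40, n=30, r=3, seed=0):
    """Build a toy (P, Q) pair that shares row and column latent spaces."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # shared row factors
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))  # shared column factors
    core_p = np.diag(rng.uniform(1.0, 2.0, size=r))   # source core
    core_q = rng.standard_normal((r, r))              # assumed "shifted" target core
    return U @ core_p @ V.T, U @ core_q @ V.T


def top_leverage(factors, k):
    """Indices of the k rows of `factors` with the largest squared norm."""
    scores = np.sum(factors ** 2, axis=1)
    return np.sort(np.argsort(scores)[-k:])


def complete_from_block(P, Q_block, rows, cols, r):
    """Fit an r x r core on the observed block, then extrapolate to all entries."""
    U, _, Vt = np.linalg.svd(P, full_matrices=False)
    U, V = U[:, :r], Vt[:r].T
    # Least-squares solve of Q_block ~= U[rows] @ core @ V[cols].T for the core.
    core_hat = np.linalg.pinv(U[rows]) @ Q_block @ np.linalg.pinv(V[cols]).T
    return U @ core_hat @ V.T


def run_demo(r=3, n_rows=10, n_cols=8):
    P, Q = make_source_and_target(r=r)
    U, _, Vt = np.linalg.svd(P, full_matrices=False)
    rows = top_leverage(U[:, :r], n_rows)   # rows of Q we choose to query
    cols = top_leverage(Vt[:r].T, n_cols)   # columns of Q we choose to query
    Q_block = Q[np.ix_(rows, cols)]         # every other row/column is unobserved
    Q_hat = complete_from_block(P, Q_block, rows, cols, r)
    return Q, Q_hat, rows, cols


if __name__ == "__main__":
    Q, Q_hat, rows, cols = run_demo()
    print("max entrywise error:", np.max(np.abs(Q - Q_hat)))
```

In this noise-free toy, only the queried rows and columns of $Q$ are observed, yet the shared latent spaces taken from $P$ let the fitted core extend to every entry. With noise in $P$ or $Q$, the choice of which rows and columns to query affects the error, which is the regime the lower bounds address.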
Problem

Research questions and friction points this paper is trying to address.

Addresses matrix completion with missing rows and columns.
Uses transfer learning to leverage noisy source data.
Establishes minimax lower bounds for entrywise estimation error under both active and passive sampling.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer learning for MNAR matrix completion
Active sampling leverages source data efficiently
Minimax lower bound attained in the active setting without incoherence assumptions