🤖 AI Summary
This paper addresses the challenge of testing multiple linear forms under noisy low-rank matrix completion in large-scale recommender systems, where statistical inference must account for estimation bias, variance, and dependence among the estimated entries. Method: We propose a statistical inference framework that constructs new test statistics with sharp marginal and joint asymptotic distributions, combined with data splitting and symmetric aggregation to mitigate dependence and improve robustness. Contribution/Results: To our knowledge, this is the first method achieving rigorous false discovery rate (FDR) control with guaranteed power under low-rank constraints, at nearly optimal sample size requirements. Empirical evaluations on synthetic and real-world recommendation datasets demonstrate robust FDR control and significantly higher power compared to state-of-the-art baselines.
📝 Abstract
Many important tasks of large-scale recommender systems can be naturally cast as testing multiple linear forms for noisy matrix completion. These problems, however, present unique challenges because of the subtle bias-and-variance tradeoff of, and the intricate dependence among, the estimated entries induced by the low-rank structure. In this paper, we develop a general approach to overcome these difficulties by introducing new statistics for individual tests with sharp asymptotics both marginally and jointly, and utilizing them to control the false discovery rate (FDR) via a data splitting and symmetric aggregation scheme. We show that valid FDR control can be achieved with guaranteed power under nearly optimal sample size requirements using the proposed methodology. Extensive numerical simulations and real data examples are also presented to further illustrate its practical merits.
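To make the data-splitting and symmetric-aggregation idea concrete, the sketch below illustrates a generic mirror-statistic FDR procedure of the kind the abstract alludes to: two independent data splits each yield a test statistic per hypothesis, the two are combined into a statistic that is symmetric about zero under the null, and the negative tail is used to estimate the false discovery proportion. The specific statistic, the Gaussian toy data, and the `mirror_fdr_threshold` helper are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for split-wise test statistics (assumption: in the paper these
# would come from debiased low-rank estimates on each half of the data).
m, m1 = 200, 40                       # hypotheses, of which the first m1 are non-null
signal = np.zeros(m)
signal[:m1] = 3.0
t1 = signal + rng.standard_normal(m)  # statistics from data split 1
t2 = signal + rng.standard_normal(m)  # statistics from data split 2

# Symmetric aggregation: the mirror statistic is large and positive only when
# the two independent splits agree in sign; under the null it is
# (approximately) symmetric about zero, so the negative tail estimates the
# number of false positives in the matching positive tail.
mirror = np.sign(t1 * t2) * (np.abs(t1) + np.abs(t2))

def mirror_fdr_threshold(m_stat, q):
    """Smallest t with estimated FDP  #{M <= -t} / max(#{M >= t}, 1) <= q."""
    for t in np.sort(np.abs(m_stat)):
        fdp_hat = np.sum(m_stat <= -t) / max(np.sum(m_stat >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf

tau = mirror_fdr_threshold(mirror, q=0.1)
rejected = np.flatnonzero(mirror >= tau)
fdp = np.mean(rejected >= m1) if rejected.size else 0.0
print(f"rejections: {rejected.size}, empirical FDP: {fdp:.3f}")
```

This is the same aggregation principle used by knockoff- and mirror-style procedures: symmetry of the null statistics, rather than knowledge of their exact distribution, is what delivers the FDP estimate and hence the FDR guarantee.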