🤖 AI Summary
This paper addresses the challenge of testing multiple linear forms under noisy low-rank matrix completion in large-scale recommender systems, where statistical inference must account for estimation bias, variance, and dependence among the estimated entries. Method: We propose a statistical inference framework that constructs new test statistics with sharp marginal and joint asymptotic distributions, combined with data splitting and symmetric aggregation to mitigate dependence and improve robustness. Contribution/Results: To our knowledge, this is the first method achieving rigorous false discovery rate (FDR) control with guaranteed power under low-rank constraints, at nearly optimal sample size requirements. Empirical evaluations on synthetic and real-world recommendation datasets demonstrate robust FDR control and significantly higher power compared to state-of-the-art baselines.
📝 Abstract
Many important tasks of large-scale recommender systems can be naturally cast as testing multiple linear forms for noisy matrix completion. These problems, however, present unique challenges because of the subtle bias-and-variance tradeoff of, and the intricate dependence among, the estimated entries induced by the low-rank structure. In this paper, we develop a general approach to overcome these difficulties by introducing new statistics for individual tests with sharp asymptotics both marginally and jointly, and utilizing them to control the false discovery rate (FDR) via a data splitting and symmetric aggregation scheme. We show that valid FDR control can be achieved with guaranteed power under nearly optimal sample size requirements using the proposed methodology. Extensive numerical simulations and real data examples are also presented to further illustrate its practical merits.
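To make the data-splitting and symmetric-aggregation idea concrete, the sketch below illustrates a generic mirror-statistic FDR procedure of the kind the abstract alludes to: two independent data splits each yield a test statistic per hypothesis, the two are combined into a statistic that is symmetric about zero under the null, and the negative tail is used to estimate the false discovery proportion. The specific statistic, the Gaussian toy data, and the `mirror_fdr_threshold` helper are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for split-wise test statistics (assumption: in the paper these
# would come from debiased low-rank estimates on each half of the data).
m, m1 = 200, 40                       # hypotheses, of which the first m1 are non-null
signal = np.zeros(m)
signal[:m1] = 3.0
t1 = signal + rng.standard_normal(m)  # statistics from data split 1
t2 = signal + rng.standard_normal(m)  # statistics from data split 2

# Symmetric aggregation: the mirror statistic is large and positive only when
# the two independent splits agree in sign; under the null it is
# (approximately) symmetric about zero, so the negative tail estimates the
# number of false positives in the matching positive tail.
mirror = np.sign(t1 * t2) * (np.abs(t1) + np.abs(t2))

def mirror_fdr_threshold(m_stat, q):
    """Smallest t with estimated FDP  #{M <= -t} / max(#{M >= t}, 1) <= q."""
    for t in np.sort(np.abs(m_stat)):
        fdp_hat = np.sum(m_stat <= -t) / max(np.sum(m_stat >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf

tau = mirror_fdr_threshold(mirror, q=0.1)
rejected = np.flatnonzero(mirror >= tau)
fdp = np.mean(rejected >= m1) if rejected.size else 0.0
print(f"rejections: {rejected.size}, empirical FDP: {fdp:.3f}")
```

This is the same aggregation principle used by knockoff- and mirror-style procedures: symmetry of the null statistics, rather than knowledge of their exact distribution, is what delivers the FDP estimate and hence the FDR guarantee.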