PAC-Bayes Bounds for Multivariate Linear Regression and Linear Autoencoders

📅 2025-12-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Linear autoencoders (LAEs) achieve empirical success in recommender systems, yet lack rigorous generalization guarantees. Method: This paper establishes the first convergent PAC-Bayes generalization bound for multivariate linear regression and bridges LAEs to rank-constrained multivariate regression via an exact theoretical equivalence. It further develops an efficient constrained optimization algorithm that makes the bound tractable on large-scale, real-world datasets. Contribution/Results: The derived bound is tight and strongly correlated with ranking metrics, including Recall@K and NDCG@K, demonstrating both theoretical soundness and practical utility across multiple large-scale recommendation benchmarks. Key contributions: (1) the first convergent PAC-Bayes bound for multivariate linear regression; (2) a rigorous theoretical equivalence between LAEs and rank-constrained regression; and (3) a generalization analysis framework that balances theoretical rigor with computational tractability.

📝 Abstract
Linear Autoencoders (LAEs) have shown strong performance in state-of-the-art recommender systems. However, this success remains largely empirical, with limited theoretical understanding. In this paper, we investigate the generalizability -- a theoretical measure of model performance in statistical learning -- of multivariate linear regression and LAEs. We first propose a PAC-Bayes bound for multivariate linear regression, extending the earlier bound for single-output linear regression by Shalaeva et al., and establish sufficient conditions for its convergence. We then show that LAEs, when evaluated under a relaxed mean squared error, can be interpreted as constrained multivariate linear regression models on bounded data, to which our bound adapts. Furthermore, we develop theoretical methods to improve the computational efficiency of optimizing the LAE bound, enabling its practical evaluation on large models and real-world datasets. Experimental results demonstrate that our bound is tight and correlates well with practical ranking metrics such as Recall@K and NDCG@K.
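For context, a classical PAC-Bayes bound (McAllester-style, stated for losses bounded in [0,1]) takes the following form; the paper's contribution is an analogous convergent bound for the unbounded multivariate regression loss:

$$
\mathbb{E}_{h\sim Q}\left[L(h)\right] \;\le\; \mathbb{E}_{h\sim Q}\left[\hat{L}_S(h)\right] \;+\; \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{n}{\delta}}{2(n-1)}}
$$

Here the inequality holds with probability at least 1 − δ over an i.i.d. sample S of size n, simultaneously for all posteriors Q, given a data-independent prior P; KL denotes the Kullback–Leibler divergence.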
Problem

Research questions and friction points this paper is trying to address.

Extends PAC-Bayes bounds to multivariate linear regression for theoretical generalization analysis.
Applies these bounds to linear autoencoders by interpreting them as constrained regression models.
Develops efficient methods to compute bounds for large-scale models and real-world datasets.
Innovation

Methods, ideas, or system contributions that make the work stand out.

PAC-Bayes bound for multivariate linear regression
Interpret LAEs as constrained regression on bounded data
Optimize bound computationally for large-scale evaluation
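As a toy illustration of the "LAE as constrained regression" view (a background sketch, not the paper's algorithm), the snippet below fits an EASE-style linear autoencoder: ridge regression of each item column on all other items under a zero-diagonal constraint. The interaction matrix `X` and regularizer `lam` are made-up example values.

```python
import numpy as np

# Hypothetical toy interaction matrix: 4 users x 5 items (binary feedback).
X = np.array([
    [1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1],
], dtype=float)

lam = 0.5  # L2 regularization strength (assumed value)

# Closed-form weights of a linear autoencoder with a zero-diagonal
# constraint (EASE-style):
#   B = argmin ||X - XB||_F^2 + lam * ||B||_F^2   s.t.  diag(B) = 0
P = np.linalg.inv(X.T @ X + lam * np.eye(X.shape[1]))
B = np.eye(X.shape[1]) - P / np.diag(P)  # divide each column j by P[j, j]
np.fill_diagonal(B, 0.0)                 # enforce the constraint exactly

# Reconstruction used for scoring/ranking; its squared error on unseen
# users is the kind of quantity a generalization bound would control.
scores = X @ B
train_mse = np.mean((X - scores) ** 2)
```

The zero-diagonal constraint prevents the trivial identity solution B = I, which is what makes the problem a genuinely constrained multivariate regression rather than plain least squares.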
Ruixin Guo
Kent State University
Ruoming Jin
Professor of Computer Science, Kent State University
Big Data · Deep Learning · Graph Analytics · Graph Database · Data Mining
Xinyu Li
Kent State University
Yang Zhou
Auburn University