Core-elements Subsampling for Alternating Least Squares

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
In large-scale recommender systems, low-rank matrix factorization with missing entries suffers from high computational cost in Alternating Least Squares (ALS) due to repeated full-data regressions. This paper proposes an element-wise core subset selection method: guided by theoretical analysis, it identifies the observed entries that contribute most to parameter updates and performs sparse-matrix-based iterative optimization over this subset. The method preserves convergence guarantees and approximation accuracy while drastically reducing per-iteration computational complexity. The authors derive theoretical error bounds and sufficient sampling conditions for accurate recovery. Experiments on multiple real-world and synthetic datasets demonstrate that the approach achieves recommendation accuracy comparable to full ALS using only 10%–30% of its runtime. This work establishes a verifiable, highly efficient paradigm for large-scale sparse matrix factorization.

📝 Abstract
In this paper, we propose a novel element-wise subset selection method for the alternating least squares (ALS) algorithm, focusing on low-rank matrix factorization involving matrices with missing values, as commonly encountered in recommender systems. While ALS is widely used for providing personalized recommendations based on user-item interaction data, its high computational cost, stemming from repeated regression operations, poses significant challenges for large-scale datasets. To enhance the efficiency of ALS, we propose a core-elements subsampling method that selects a representative subset of data and leverages sparse matrix operations to approximate ALS estimations efficiently. We establish theoretical guarantees for the approximation and convergence of the proposed approach, showing that it achieves similar accuracy with significantly reduced computational time compared to full-data ALS. Extensive simulations and real-world applications demonstrate the effectiveness of our method in various scenarios, emphasizing its potential in large-scale recommendation systems.
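The ALS setting described in the abstract can be sketched as follows: a ratings matrix with missing entries is factored as R ≈ U Vᵀ by alternating ridge regressions restricted to the observed entries. This is a minimal illustration of standard masked ALS, not the paper's accelerated method; the function name, regularization value, and iteration count are illustrative choices.

```python
import numpy as np

def als_masked(R, mask, rank=2, reg=0.1, iters=20, seed=0):
    """Plain ALS for low-rank factorization R ~ U @ V.T, fitting only
    the observed entries indicated by the boolean `mask`."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    I = reg * np.eye(rank)
    for _ in range(iters):
        # Update each user factor via ridge regression on its observed items.
        for i in range(m):
            idx = mask[i].nonzero()[0]
            if idx.size:
                Vi = V[idx]
                U[i] = np.linalg.solve(Vi.T @ Vi + I, Vi.T @ R[i, idx])
        # Symmetric update for each item factor on its observed users.
        for j in range(n):
            idx = mask[:, j].nonzero()[0]
            if idx.size:
                Uj = U[idx]
                V[j] = np.linalg.solve(Uj.T @ Uj + I, Uj.T @ R[idx, j])
    return U, V
```

Each full sweep solves one small regression per row and per column over all observed entries, which is exactly the per-iteration cost the paper's core-elements subsampling aims to cut.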
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost of ALS for large-scale recommender systems
Addressing missing values in low-rank matrix factorization problems
Improving efficiency while maintaining accuracy in personalized recommendations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Core-elements subsampling for ALS algorithm
Leverages sparse matrix operations for efficiency
Provides theoretical guarantees for approximation accuracy and convergence
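The subsampling idea behind these contributions can be illustrated with a toy selection rule: score each observed entry and keep only the top fraction, then run ALS over the reduced mask. The magnitude-based score below is a hypothetical stand-in; the paper derives its own theoretically guided criterion and sampling conditions.

```python
import numpy as np

def core_elements_mask(R, mask, keep_frac=0.3):
    """Illustrative core-elements selection: keep the top `keep_frac`
    of observed entries by absolute magnitude (a placeholder score,
    not the paper's criterion). Returns a sparser boolean mask."""
    rows, cols = mask.nonzero()
    scores = np.abs(R[rows, cols])
    k = max(1, int(keep_frac * rows.size))
    top = np.argsort(scores)[-k:]          # indices of the k highest-scored entries
    core = np.zeros_like(mask)
    core[rows[top], cols[top]] = True
    return core
```

Running ALS with `core` in place of the full `mask` shrinks every per-row and per-column regression, which is the source of the reported 10%–30% runtime.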
Dunyao Xue
Institute of Statistics and Big Data, Renmin University of China, Beijing, China
Mengyu Li
Department of Statistics and Data Science, Tsinghua University, Beijing, China
Cheng Meng
Institute of Statistics and Big Data, Renmin University of China
Data Science · Optimal Transport · Subsampling · Smoothing Splines
Jingyi Zhang
School of Science, Department of Mathematics, Beijing University of Posts and Telecommunications, Beijing, China