🤖 AI Summary
Real-world data matrices often lack global low-rank structure yet contain locally low-rank submatrices. To address this, we propose the first framework for discovering locally low-rank submatrices that offers both theoretical guarantees and computational efficiency. The method follows a two-stage “random sampling–greedy expansion” strategy: (i) row/column sampling identifies initial candidate submatrices; (ii) each candidate is iteratively expanded using SVD-based low-rank approximation, with an error-monitoring mechanism that bounds its spectral-norm deviation from the optimal low-rank approximation. We establish a provable relative error bound for the recovered submatrices. Experiments on diverse real-world datasets show that the approach outperforms existing baselines in localization accuracy, error control, and scalability, enabling large-scale mining of local low-rank patterns with rigorous quality guarantees.
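As a reading aid, the "spectral-norm deviation from the optimal low-rank approximation" can be made concrete with the Eckart–Young–Mirsky theorem: for a submatrix $X$ with singular values $\sigma_1(X) \ge \sigma_2(X) \ge \dots$ and best rank-$k$ approximation $X_k$ (its truncated SVD), the deviation equals the $(k+1)$-th singular value. The relative threshold $\varepsilon$ on the right is an illustrative assumption; the summary does not state the exact form of the paper's bound.

```latex
\[
\min_{\operatorname{rank}(Y)\le k} \|X - Y\|_2
  \;=\; \|X - X_k\|_2
  \;=\; \sigma_{k+1}(X),
\qquad
\text{(assumed criterion)}\quad
\frac{\sigma_{k+1}(X)}{\sigma_1(X)} \;\le\; \varepsilon .
\]
```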
📝 Abstract
The problem of approximating a matrix by a low-rank one has been extensively studied. This problem assumes, however, that the whole matrix has a low-rank structure, an assumption that is often false for real-world matrices. We consider the problem of discovering submatrices of a given matrix that deviate from their low-rank approximations by at most a bounded amount. We introduce an effective two-phase method for this task: first, sampling discovers small, nearly low-rank submatrices; then these submatrices are expanded while preserving proximity to a low-rank approximation. An extensive experimental evaluation confirms that the method compares favorably to existing approaches.
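The two-phase idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm. The function names (`lowrank_error`, `sample_seed`, `expand`), the relative spectral-norm criterion, and the parameters `k`, `eps`, `size`, and `trials` are all assumptions chosen for illustration. The sketch samples small row/column subsets until one is nearly rank-`k`, then greedily adds rows and columns for as long as the submatrix stays within the error threshold.

```python
import numpy as np


def lowrank_error(X, k):
    """Relative spectral-norm distance from X to its best rank-k approximation.

    By Eckart-Young-Mirsky this is sigma_{k+1}(X) / sigma_1(X).
    """
    s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
    if k >= len(s) or s[0] == 0:
        return 0.0
    return float(s[k] / s[0])


def sample_seed(A, k, size, eps, trials, rng):
    """Phase 1 (assumed form): sample small submatrices until one is nearly rank-k."""
    m, n = A.shape
    for _ in range(trials):
        rows = rng.choice(m, size=size, replace=False).tolist()
        cols = rng.choice(n, size=size, replace=False).tolist()
        if lowrank_error(A[np.ix_(rows, cols)], k) <= eps:
            return rows, cols
    return None  # no seed found within the trial budget


def expand(A, rows, cols, k, eps):
    """Phase 2 (assumed form): greedily add rows/columns while the error stays <= eps."""
    rows, cols = list(rows), list(cols)
    m, n = A.shape
    changed = True
    while changed:
        changed = False
        for i in range(m):
            if i not in rows and lowrank_error(A[np.ix_(rows + [i], cols)], k) <= eps:
                rows.append(i)
                changed = True
        for j in range(n):
            if j not in cols and lowrank_error(A[np.ix_(rows, cols + [j])], k) <= eps:
                cols.append(j)
                changed = True
    return sorted(rows), sorted(cols)
```

Recomputing a full SVD for every candidate row or column keeps the sketch short but is expensive. A scalable implementation would presumably update the approximation incrementally, which is outside what this example shows.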