Sample and Expand: Discovering Low-rank Submatrices With Quality Guarantees

📅 2025-06-06
🤖 AI Summary
Real-world data matrices often lack global low-rank structure but exhibit local low-rank substructures. To address this, we propose the first framework for discovering locally low-rank submatrices that provides both theoretical guarantees and computational efficiency. Our method employs a two-stage "random sampling, then greedy expansion" strategy: (i) initial candidate submatrices are identified via column/row sampling; (ii) these candidates are iteratively expanded using SVD-based low-rank approximation and an error-monitoring mechanism that strictly bounds the spectral-norm deviation from the optimal low-rank approximation. We establish a provable relative error bound for the recovered submatrices. Experiments on diverse real-world datasets show that our approach outperforms existing baselines in localization accuracy, error control, and scalability, enabling large-scale local low-rank pattern mining with rigorous quality guarantees.

📝 Abstract
The problem of approximating a matrix by a low-rank one has been extensively studied. This problem assumes, however, that the whole matrix has a low-rank structure. This assumption is often false for real-world matrices. We consider the problem of discovering submatrices from the given matrix with bounded deviations from their low-rank approximations. We introduce an effective two-phase method for this task: first, we use sampling to discover small nearly low-rank submatrices, and then they are expanded while preserving proximity to a low-rank approximation. An extensive experimental evaluation confirms that the method we introduce compares favorably to existing approaches.
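The two phases described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`rel_error`, `sample_seed`, `expand`), the tolerance `eps`, the target rank `k`, and the choice of measuring deviation as the ratio of the (k+1)-th to the largest singular value are all assumptions made for illustration. Phase 1 samples small submatrices until one is nearly rank-k; phase 2 greedily adds rows and columns as long as the deviation stays within the tolerance.

```python
import numpy as np

def rel_error(block, k):
    """Spectral-norm error of the best rank-k approximation, relative to
    the spectral norm of the block (an assumed quality measure)."""
    s = np.linalg.svd(block, compute_uv=False)
    if s[0] == 0 or k >= len(s):
        return 0.0
    # Eckart-Young: the best rank-k spectral error equals the (k+1)-th singular value.
    return float(s[k] / s[0])

def sample_seed(M, k, eps, size, trials, rng):
    """Phase 1 (sketch): sample size-by-size submatrices uniformly at random
    and return the first one whose relative error is at most eps."""
    n, m = M.shape
    for _ in range(trials):
        rows = rng.choice(n, size=size, replace=False)
        cols = rng.choice(m, size=size, replace=False)
        if rel_error(M[np.ix_(rows, cols)], k) <= eps:
            return [int(r) for r in rows], [int(c) for c in cols]
    return None  # no nearly rank-k seed found within the trial budget

def expand(M, rows, cols, k, eps):
    """Phase 2 (sketch): greedily add rows and columns while the
    submatrix stays within eps of its best rank-k approximation."""
    rows, cols = list(rows), list(cols)
    n, m = M.shape
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i not in rows and rel_error(M[np.ix_(rows + [i], cols)], k) <= eps:
                rows.append(i)
                changed = True
        for j in range(m):
            if j not in cols and rel_error(M[np.ix_(rows, cols + [j])], k) <= eps:
                cols.append(j)
                changed = True
    return sorted(rows), sorted(cols)

# Toy input: a 6x6 matrix that is not low-rank overall,
# with an exact rank-1 block planted in its top-left 4x4 corner.
M = (np.arange(36).reshape(6, 6) % 7 + 1).astype(float)
M[:4, :4] = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 3.0])

rng = np.random.default_rng(0)
seed = sample_seed(M, k=1, eps=1e-3, size=2, trials=200, rng=rng)
if seed is not None:
    print(expand(M, seed[0], seed[1], k=1, eps=1e-3))
```

Each expansion step here recomputes a full SVD, which keeps the sketch short but would be too slow for large matrices; a practical implementation would need a cheaper way to monitor the error as rows and columns are added.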
Problem

Research questions and friction points this paper is trying to address.

Discover low-rank submatrices with quality guarantees
Approximate matrices with bounded low-rank deviations
Combine sampling and expansion for effective submatrix discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sampling to discover low-rank submatrices
Expanding submatrices with quality guarantees
Two-phase method for matrix approximation
Authors
Martino Ciaperoni
Scuola Normale Superiore
Trustworthy AI, Machine Learning, Data Mining
A. Gionis
KTH Royal Institute of Technology, Sweden
H. Mannila
Aalto University, Finland