Learning-Augmented Frequent Directions

📅 2025-03-02
🤖 AI Summary
This paper addresses the problem of efficiently estimating dominant singular directions (i.e., top singular vectors) from high-dimensional matrix data streams under a single-pass constraint. To overcome the inherent trade-off between directional accuracy and memory overhead in classical Frequent Directions, we propose Learning-enhanced Frequent Directions (LFD), the first learning-augmented algorithm for matrix streams. LFD integrates insights from Misra-Gries and CountSketch, introducing a prediction-guided deterministic low-rank update and an error-correction mechanism. Theoretically, LFD achieves a tighter error bound than randomized baselines; empirically, it significantly reduces directional estimation error. It retains linear time complexity and O(kd) space complexity, where k is the target rank and d the ambient dimension. Extensive evaluation on multiple benchmarks demonstrates that LFD consistently outperforms state-of-the-art methods.

📝 Abstract
An influential paper of Hsu et al. (ICLR'19) introduced the study of learning-augmented streaming algorithms in the context of frequency estimation. A fundamental problem in the streaming literature, the goal of frequency estimation is to approximate the number of occurrences of items appearing in a long stream of data using only a small amount of memory. Hsu et al. develop a natural framework to combine the worst-case guarantees of popular solutions such as CountMin and CountSketch with learned predictions of high frequency elements. They demonstrate that learning the underlying structure of data can be used to yield better streaming algorithms, both in theory and practice. We simplify and generalize past work on learning-augmented frequency estimation. Our first contribution is a learning-augmented variant of the Misra-Gries algorithm which improves upon the error of learned CountMin and learned CountSketch and achieves the state-of-the-art performance of randomized algorithms (Aamand et al., NeurIPS'23) with a simpler, deterministic algorithm. Our second contribution is to adapt learning-augmentation to a high-dimensional generalization of frequency estimation corresponding to finding important directions (top singular vectors) of a matrix given its rows one-by-one in a stream. We analyze a learning-augmented variant of the Frequent Directions algorithm, extending the theoretical and empirical understanding of learned predictions to matrix streaming.
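For readers unfamiliar with the baseline, here is a minimal sketch of the classic (non-learned) Misra-Gries summary that the abstract builds on. The function name `misra_gries` and the toy stream are illustrative assumptions, not taken from the paper. With k counters over a stream of length n, each estimate never overcounts and undercounts by at most n/(k+1).

```python
from collections import Counter


def misra_gries(stream, k):
    """Summarize `stream` with at most k counters (classic Misra-Gries)."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the new item.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Toy stream (an assumption for illustration): one heavy item, one medium, three rare.
stream = ["a"] * 6 + ["b"] * 3 + ["c", "d", "e"]
k = 2
summary = misra_gries(stream, k)
true_counts = Counter(stream)
error_bound = len(stream) / (k + 1)

for item, true in true_counts.items():
    estimate = summary.get(item, 0)
    print(item, "true:", true, "estimate:", estimate, "bound:", error_bound)
```

Items missing from the summary are estimated as 0, which is why rare items cost nothing in memory but still fall within the error bound.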

Problem

Research questions and friction points this paper is trying to address.

Improves frequency estimation in data streams using learning-augmented algorithms.
Simplifies and generalizes learning-augmented frequency estimation techniques.
Extends learning-augmentation to matrix streaming for top singular vectors.
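One way to picture how predictions can help frequency estimation is sketched below. This is a hypothetical illustration in the spirit of the framework the abstract attributes to Hsu et al., not the paper's actual algorithm: the name `learned_misra_gries`, the `predicted_heavy` input (standing in for a learned oracle's output), and the budget split are all assumptions. Predicted heavy items get dedicated exact counters, and the remaining counters run plain Misra-Gries on everything else.

```python
def learned_misra_gries(stream, k, predicted_heavy):
    """Hypothetical sketch: exact counters for predicted items, Misra-Gries for the rest."""
    predicted = set(predicted_heavy)
    if len(predicted) >= k:
        raise ValueError("need at least one counter left for unpredicted items")
    exact = dict.fromkeys(predicted, 0)   # one dedicated counter per predicted item
    residual = {}                         # Misra-Gries summary of all other items
    budget = k - len(predicted)
    for item in stream:
        if item in exact:
            exact[item] += 1
        elif item in residual:
            residual[item] += 1
        elif len(residual) < budget:
            residual[item] = 1
        else:
            # Residual summary is full: decrement all and drop the new item.
            for key in list(residual):
                residual[key] -= 1
                if residual[key] == 0:
                    del residual[key]
    return exact, residual


def estimate(item, exact, residual):
    """Estimated count: exact if the item was predicted, else the residual counter."""
    return exact.get(item, residual.get(item, 0))


# Toy stream and predictions (assumptions for illustration only).
stream = ["a"] * 6 + ["b"] * 3 + ["c", "d", "e"]
exact, residual = learned_misra_gries(stream, k=3, predicted_heavy={"a", "b"})
print(estimate("a", exact, residual), estimate("b", exact, residual))
```

When the predictions are right, the heavy items are counted exactly and the Misra-Gries error applies only to the lighter remainder of the stream; when they are wrong, the residual summary still provides a worst-case guarantee with a smaller counter budget.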

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning-augmented Misra-Gries algorithm improves on the error of learned CountMin and learned CountSketch.
Adapts learning-augmentation to a high-dimensional generalization of frequency estimation.
Extends Frequent Directions with learned predictions for matrix streams.
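For context, the classical Frequent Directions baseline can be sketched as follows. This is a minimal illustration of the standard, non-learned algorithm, assuming NumPy is available; it is not the learning-augmented variant analyzed in the paper, and the function name `frequent_directions` and the random test matrix are assumptions. The sketch B keeps `ell` rows; whenever it fills up, all squared singular values are shrunk by the smallest one, which frees at least one row. This mirrors the decrement step of Misra-Gries.

```python
import numpy as np


def frequent_directions(A, ell):
    """Return an ell x d sketch B of the rows of A (basic Frequent Directions)."""
    n, d = A.shape
    if ell > d:
        raise ValueError("this sketch assumes ell <= d")
    B = np.zeros((ell, d))
    for row in A:  # rows arrive one at a time, as in a stream
        zero_rows = np.flatnonzero(~B.any(axis=1))
        if zero_rows.size == 0:
            # Sketch is full: shrink squared singular values by the smallest one.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            shrunk = np.sqrt(np.maximum(s**2 - s[-1] ** 2, 0.0))
            B = shrunk[:, None] * Vt  # last row becomes exactly zero
            zero_rows = np.flatnonzero(~B.any(axis=1))
        B[zero_rows[0]] = row
    return B


# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
ell = 5
B = frequent_directions(A, ell)

# Covariance error in spectral norm versus the classical bound ||A||_F^2 / ell.
cov_error = np.linalg.norm(A.T @ A - B.T @ B, 2)
bound = np.linalg.norm(A, "fro") ** 2 / ell
print(cov_error <= bound)
```

The sketch uses O(ell * d) memory regardless of the number of rows, and its top right singular vectors serve as estimates of the important directions of A.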