🤖 AI Summary
For approximate matrix multiplication (AMM) in sliding-window streaming settings, existing methods suffer from high space complexity and struggle to ensure timeliness and accuracy simultaneously. This paper proposes SO-COD, the first algorithm achieving optimal space complexity for sliding-window AMM: *O*((*d*ₓ + *d*ᵧ)/ε) for normalized inputs and *O*((*d*ₓ + *d*ᵧ)/ε · log *R*) for unnormalized inputs, matching the theoretical lower bound. SO-COD maintains a compact, dynamically updated sketch of column contributions via snapshot tracking and error-controlled adaptation, retaining only the most recent window's data. Extensive experiments on synthetic and real-world datasets demonstrate that SO-COD outperforms baseline methods, reducing memory consumption by one to two orders of magnitude while guaranteeing high approximation fidelity (relative error < ε).
📝 Abstract
Matrix multiplication is a core operation in numerous applications, yet its exact computation becomes prohibitively expensive as data scales, especially in streaming environments where timeliness is critical. In many real-world scenarios, data arrives continuously, making it essential to focus on recent information via sliding windows. While existing approaches offer approximate solutions, they often suffer from suboptimal space complexities when extended to the sliding-window setting. In this work, we introduce SO-COD, a novel algorithm for approximate matrix multiplication (AMM) in the sliding-window streaming setting, where only the most recent data is retained for computation. Inspired by frequency estimation over sliding windows, our method tracks significant contributions, referred to as snapshots, from incoming data and efficiently updates them as the window advances. Given matrices \(\boldsymbol{X} \in \mathbb{R}^{d_x \times n}\) and \(\boldsymbol{Y} \in \mathbb{R}^{d_y \times n}\) for computing \(\boldsymbol{X}\boldsymbol{Y}^T\), we analyze two data settings. In the *normalized* setting, where each column of the input matrices has unit \(L_2\) norm, SO-COD achieves an optimal space complexity of \(O\left(\frac{d_x+d_y}{\epsilon}\right)\). In the *unnormalized* setting, where the squared column norms vary within a bounded range \([1, R]\), we show that the space requirement is \(O\left(\frac{d_x+d_y}{\epsilon}\log R\right)\), which matches the theoretical lower bound for an \(\epsilon\)-approximation guarantee. Extensive experiments on synthetic and real-world datasets demonstrate that SO-COD effectively balances space cost and approximation error, making it a promising solution for large-scale, dynamic streaming matrix multiplication.
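To make the snapshot idea concrete, the following toy sketch illustrates the general pattern described above for the normalized setting: buffer incoming column pairs, seal a timestamped "snapshot" once roughly an ε-fraction of the window has accumulated, and expire snapshots as they slide out of the window. This is an illustrative simplification under stated assumptions, not the paper's actual SO-COD algorithm (the class name, parameters, and the use of exact partial products instead of compact COD sketches are all assumptions for clarity).

```python
import numpy as np
from collections import deque

class SlidingWindowAMMSketch:
    """Toy snapshot-based estimator for sliding-window AMM
    (illustrative only; NOT the paper's SO-COD algorithm).

    Streams unit-norm column pairs (x_t, y_t) and approximates
    sum_{t in window} x_t @ y_t.T. Columns are buffered, and once
    about eps * window columns accumulate, their exact partial
    product is sealed as a timestamped snapshot. Snapshots are
    expired once their first column leaves the window, so at most
    one partially-in-window snapshot (an eps-fraction of the mass)
    is ever dropped early.
    """

    def __init__(self, dx, dy, window, eps):
        self.dx, self.dy = dx, dy
        self.window = window
        self.block = max(1, int(eps * window))  # columns per snapshot
        self.buf_x, self.buf_y = [], []         # unsealed columns
        self.buf_t = None                       # timestamp of first buffered column
        self.snapshots = deque()                # (start_time, partial product)
        self.t = 0

    def update(self, x, y):
        self.t += 1
        if not self.buf_x:
            self.buf_t = self.t
        self.buf_x.append(x)
        self.buf_y.append(y)
        if len(self.buf_x) >= self.block:
            X = np.stack(self.buf_x, axis=1)
            Y = np.stack(self.buf_y, axis=1)
            self.snapshots.append((self.buf_t, X @ Y.T))
            self.buf_x, self.buf_y = [], []
        # expire snapshots whose first column has left the window
        while self.snapshots and self.snapshots[0][0] <= self.t - self.window:
            self.snapshots.popleft()

    def query(self):
        total = np.zeros((self.dx, self.dy))
        for _, partial in self.snapshots:
            total += partial
        if self.buf_x:  # include the not-yet-sealed tail of the stream
            X = np.stack(self.buf_x, axis=1)
            Y = np.stack(self.buf_y, axis=1)
            total += X @ Y.T
        return total
```

The key design point this sketch shares with the method described above is that space is governed by the number of live snapshots, roughly \(1/\epsilon\), rather than by the window length; the real algorithm additionally compresses each snapshot so the per-snapshot cost is \(O(d_x + d_y)\) instead of a full \(d_x \times d_y\) product.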