🤖 AI Summary
This work addresses sequence prediction for linear dynamical systems with long memory (marginally stable systems), where existing methods typically incur regret that grows polynomially with the hidden dimension. It combines the Vovk–Azoury–Warmuth (VAW) second-order online learning algorithm with Universal Sequence Preconditioning (USP), supported by new complex-analytic bounds on Chebyshev polynomials. USP exponentially shortens the memory needed for accurate prediction, but the preconditioned sequence has exponentially larger diameters and gradients, which has kept USP from attaining optimal regret bounds; VAW exploits the memory compression while remaining robust to this blow-up. The combined method achieves a polylogarithmic regret bound of $O(\log^3 T)$ for any marginally stable linear dynamical system, including systems with asymmetric hidden transition matrices. The new Chebyshev bounds further extend USP beyond bounded-spectrum systems to systems with constant complex arguments.
📝 Abstract
Sequence prediction methods for dynamical systems with long memory, i.e., marginally stable systems, typically achieve regret that grows polynomially with the hidden dimension of the underlying generative model. Universal Sequence Preconditioning (USP) is a method that compresses any sequence generated by a linear dynamical system into a "preconditioned" sequence that requires exponentially shorter memory for accurate prediction. However, the preconditioned sequence yields exponentially larger diameters and gradients, preventing USP from attaining optimal regret bounds. Inspired by the minimum description length principle, we show that the Vovk-Azoury-Warmuth (VAW) algorithm is naturally matched to the USP regime: it takes advantage of the memory compression while remaining robust to the exponential explosion of the diameter. We prove that combining USP with VAW achieves astoundingly strong results: for any marginally stable linear dynamical system, this algorithm achieves polylogarithmic regret $O \left( \log^3 T \right)$ even in the presence of asymmetric hidden transition matrices. Finally, we extend the applicability of USP beyond bounded-spectrum systems by providing new complex-analytic bounds on Chebyshev polynomials, allowing for systems with constant complex arguments.
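The two ingredients named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the polynomial normalization, the feature map, or how the preconditioner and the learner are coupled, so the function and class names, the degree `n = 8`, and the regularizer `reg` below are all hypothetical choices. The first part filters a sequence with the coefficients of the monic Chebyshev polynomial $p_n = T_n / 2^{n-1}$, so that a scalar mode $y_t = \lambda^t$ with $\lambda \in [-1, 1]$ becomes $\lambda^{t-n} p_n(\lambda)$, whose magnitude is at most $2^{1-n}$. The second part is the standard VAW forecaster for online least squares, which predicts $\hat{y}_t = x_t^\top A_t^{-1} b_{t-1}$ with $A_t = \text{reg} \cdot I + \sum_{s \le t} x_s x_s^\top$ and $b_{t-1} = \sum_{s < t} y_s x_s$.

```python
def monic_chebyshev_coeffs(n):
    """Coefficients of T_n(x) / 2^(n-1), highest degree first (leading coeff = 1)."""
    if n == 0:
        return [1.0]
    # Recurrence T_{k+1} = 2x T_k - T_{k-1}; lists store lowest degree first.
    prev, cur = [1.0], [0.0, 1.0]
    for _ in range(n - 1):
        nxt = [0.0] + [2.0 * c for c in cur]  # multiply T_k by 2x
        for i, c in enumerate(prev):
            nxt[i] -= c                       # subtract T_{k-1}
        prev, cur = cur, nxt
    lead = cur[-1]                            # equals 2^(n-1) for n >= 1
    return [c / lead for c in reversed(cur)]


def precondition(y, coeffs):
    """Return z_t = sum_i coeffs[i] * y[t - i] for t = n, ..., len(y) - 1."""
    n = len(coeffs) - 1
    return [
        sum(c * y[t - i] for i, c in enumerate(coeffs))
        for t in range(n, len(y))
    ]


class VAWForecaster:
    """Vovk-Azoury-Warmuth forecaster for online least squares (illustrative)."""

    def __init__(self, dim, reg=1.0):
        # Inverse of A_0 = reg * I, maintained by rank-one updates.
        self.A_inv = [
            [1.0 / reg if i == j else 0.0 for j in range(dim)]
            for i in range(dim)
        ]
        self.b = [0.0] * dim

    def predict(self, x):
        """Fold x into A *before* predicting; call exactly once per round."""
        # Sherman-Morrison update of (A + x x^T)^{-1}; A_inv stays symmetric.
        Ax = [sum(a * xi for a, xi in zip(row, x)) for row in self.A_inv]
        denom = 1.0 + sum(xi * axi for xi, axi in zip(x, Ax))
        d = len(x)
        for i in range(d):
            for j in range(d):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        w = [sum(a * bi for a, bi in zip(row, self.b)) for row in self.A_inv]
        return sum(wi * xi for wi, xi in zip(w, x))

    def update(self, x, y):
        """Record the revealed label y for the features x of this round."""
        for i, xi in enumerate(x):
            self.b[i] += y * xi


if __name__ == "__main__":
    coeffs = monic_chebyshev_coeffs(8)
    # A marginally stable scalar mode (lambda = 1) never decays on its own,
    # yet every preconditioned value is p_8(1) = 2**-7.
    z = precondition([1.0] * 30, coeffs)
    print(max(abs(v) for v in z))  # 0.0078125
```

The distinguishing feature of VAW relative to plain ridge regression is that the current feature vector enters $A_t$ before the prediction is made, which is why `predict` mutates the state and `update` only touches `b`.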