🤖 AI Summary
Although deep state-space models (SSMs) achieve strong empirical results on long sequences, existing PAC generalization bounds for them typically scale with the length of the input sequence, which limits the theoretical guarantees available for long-horizon tasks.
Method: This work establishes the first input-length-independent PAC generalization bound for deep SSMs by combining PAC learning theory with stability analysis of the SSM blocks (in particular, spectral radius constraints on the state-transition matrix) and structural properties of canonical deep SSM architectures (e.g., S4, S5, LRU).
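For reference, a minimal sketch (with illustrative notation, not necessarily the paper's) of the discrete-time linear SSM recursion inside each block and of the stability condition the summary refers to:

```latex
% Discrete-time linear SSM block: x_t is the hidden state, u_t the input, y_t the output.
\begin{aligned}
x_{t+1} &= A\,x_t + B\,u_t,\\
y_t     &= C\,x_t + D\,u_t.
\end{aligned}
% The block is (Schur) stable when the spectral radius of the state matrix is below one:
\rho(A) \;=\; \max_i \lvert \lambda_i(A)\rvert \;<\; 1.
```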
Contribution/Results: The paper derives a generalization bound for architectures with stable SSM blocks and proves that the bound decreases monotonically as the degree of stability of the blocks increases; that is, more stable blocks come with a smaller guaranteed generalization gap. This provides a theoretical justification for the standard practice of enforcing stability in deep SSM architectures, while avoiding the length dependence that limits earlier bounds, and it supports more interpretable and reliable use of deep SSMs in long-range sequence modeling.
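One generic way to see why stability can make such a bound independent of the input length (a standard argument sketched here under simplifying assumptions, not necessarily the paper's exact proof): when the state matrix is diagonal, as in S5- and LRU-style blocks, the influence of past inputs decays geometrically, so the relevant sums are bounded uniformly in the horizon T:

```latex
% For diagonal A with spectral radius rho(A) < 1, the impulse-response norms
% form a geometric series whose sum does not grow with the horizon T:
\sum_{k=0}^{T-1} \lVert A^k \rVert_2
  \;=\; \sum_{k=0}^{T-1} \rho(A)^k
  \;\le\; \frac{1}{1-\rho(A)}
  \qquad \text{for every } T \ge 1.
```

The right-hand side shrinks as the spectral radius decreases, which is consistent with the claim that the bound improves as the blocks become more stable.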
📝 Abstract
Many state-of-the-art models trained on long-range sequences, for example S4, S5, or LRU, are made of sequential blocks combining State-Space Models (SSMs) with neural networks. In this paper we provide a PAC bound that holds for these kinds of architectures with stable SSM blocks and does not depend on the length of the input sequence. Imposing stability on the SSM blocks is a standard practice in the literature, and it is known to help performance. Our results provide a theoretical justification for the use of stable SSM blocks, as the proposed PAC bound decreases as the degree of stability of the SSM blocks increases.
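As a concrete illustration of the kind of architecture described above, the sketch below stacks a stability-constrained diagonal SSM with a small pointwise neural network. All names are hypothetical, and the LRU-style parameterization lambda = exp(-exp(nu) + i*theta) is one standard way to keep every eigenvalue strictly inside the unit circle; this is not the paper's implementation.

```python
import numpy as np


class StableDiagonalSSMBlock:
    """One 'SSM + neural network' block (illustrative sketch, not the paper's code).

    The diagonal state matrix is parameterized so that every eigenvalue has
    magnitude exp(-exp(nu)) < 1, i.e. the recursion is stable by construction
    (LRU-style parameterization; spectral radius rho(A) < 1).
    """

    def __init__(self, d_model: int, d_state: int, rng: np.random.Generator):
        self.nu = rng.normal(size=d_state)               # |lambda| = exp(-exp(nu)) in (0, 1)
        self.theta = rng.uniform(0, 2 * np.pi, d_state)  # eigenvalue phases
        self.B = rng.normal(size=(d_state, d_model)) / np.sqrt(d_model)
        self.C = rng.normal(size=(d_model, d_state)) / np.sqrt(d_state)
        # Pointwise MLP applied to the SSM output at every time step.
        self.W1 = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
        self.W2 = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)

    def __call__(self, u: np.ndarray) -> np.ndarray:
        """u: (T, d_model) input sequence -> (T, d_model) output sequence."""
        lam = np.exp(-np.exp(self.nu) + 1j * self.theta)  # diagonal of A, |lam| < 1
        T, _ = u.shape
        x = np.zeros(lam.shape[0], dtype=complex)
        ys = np.empty_like(u)
        for t in range(T):
            x = lam * x + self.B @ u[t]   # x_t = A x_{t-1} + B u_t
            ys[t] = (self.C @ x).real     # y_t = Re(C x_t)
        h = np.maximum(ys @ self.W1.T, 0.0)  # pointwise ReLU MLP
        return h @ self.W2.T + u             # residual connection


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = StableDiagonalSSMBlock(d_model=8, d_state=16, rng=rng)
    u = rng.normal(size=(100, 8))  # length-100 input sequence
    y = block(u)
    print(y.shape)                 # (100, 8)
```

Because exp(-exp(nu)) is always strictly below one, the recursion stays stable for any learned value of nu; constraints of this kind are exactly the stability assumptions under which the proposed length-independent PAC bound applies.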