AIRE-Prune: Asymptotic Impulse-Response Energy for State Pruning in State Space Models

📅 2026-01-31

🤖 AI Summary

This work addresses the substantial computational and memory overhead of state space models caused by high-dimensional latent states, which often forces compromises in model capacity or stability. The authors propose a structured post-training pruning method that introduces asymptotic impulse response energy as a novel metric for state importance. By minimizing long-term output energy distortion, the approach enables global, cross-layer pruning and extends modal truncation to deep stacked architectures. Leveraging a closed-form energy score and inter-layer normalization, the method operates without requiring retraining. Evaluated across multiple sequence modeling benchmarks, it achieves an average pruning ratio of 60.8% with only a 0.29% drop in accuracy, substantially reducing computational costs while preserving performance.

📝 Abstract

State space models (SSMs) often sacrifice capacity, search space, or stability to offset the memory and compute costs of large state dimensions. We introduce AIRE-Prune (Asymptotic Impulse-Response Energy for State Pruning), a structured post-training pruning method for SSMs that reduces each layer's state dimension by directly minimizing long-run output-energy distortion. AIRE-Prune assigns every state a closed-form score based on its asymptotic impulse-response energy, i.e., the total impulse-response energy it contributes over an infinite time horizon, and normalizes these scores layer-wise to enable global cross-layer comparison and selection. This extends modal truncation from single systems to deep stacks and aligns pruning with asymptotic response energy rather than worst-case gain. Across diverse sequence benchmarks, AIRE-Prune reveals substantial redundancy in SISO and MIMO SSMs, reaching an average pruning ratio of 60.8% with an average accuracy drop of 0.29% and no retraining, while significantly lowering compute. Code: https://github.com/falcon-arrow/AIRE-Prune.
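
To make the scoring idea concrete, here is a minimal sketch of how a per-state infinite-horizon energy score and a global cross-layer selection could look for diagonal, discrete-time SSM layers. It is not taken from the AIRE-Prune repository. The function names `mode_energy` and `global_keep_masks`, the `(lam, B, C)` layer layout, and the choice to normalize by each layer's total energy are all assumptions made for illustration; the paper's exact score and normalization may differ. The closed form follows from the geometric series: a stable mode with eigenvalue `lam[i]` contributes energy `||C[:, i]||^2 * ||B[i, :]||^2 / (1 - |lam[i]|^2)`.

```python
import numpy as np

def mode_energy(lam, B, C):
    """Infinite-horizon impulse-response energy of each mode of a diagonal,
    discrete-time SSM  x[k+1] = diag(lam) x[k] + B u[k],  y[k] = C x[k].

    Up to a one-step delay, mode i has impulse response
    C[:, i] * lam[i]**k * B[i, :] for k = 0, 1, ..., so its total energy is
    ||C[:, i]||^2 * ||B[i, :]||^2 / (1 - |lam[i]|^2), valid for |lam[i]| < 1.
    """
    lam = np.asarray(lam)
    B = np.asarray(B)
    C = np.asarray(C)
    if np.any(np.abs(lam) >= 1.0):
        raise ValueError("all modes must be strictly stable (|lam| < 1)")
    b2 = np.sum(np.abs(B) ** 2, axis=1)  # squared norm of each row of B
    c2 = np.sum(np.abs(C) ** 2, axis=0)  # squared norm of each column of C
    return b2 * c2 / (1.0 - np.abs(lam) ** 2)

def global_keep_masks(layers, prune_ratio):
    """Return one boolean keep-mask per layer.

    ASSUMPTION: scores are normalized by each layer's total energy so that
    they are comparable across layers; this is an illustrative choice, not
    necessarily the normalization used by AIRE-Prune.
    """
    normed = []
    for lam, B, C in layers:
        s = mode_energy(lam, B, C)
        normed.append(s / s.sum())
    flat = np.concatenate(normed)
    n_prune = int(np.floor(prune_ratio * flat.size))
    keep = np.ones(flat.size, dtype=bool)
    keep[np.argsort(flat)[:n_prune]] = False  # drop the lowest-scoring states
    sizes = [s.size for s in normed]
    return np.split(keep, np.cumsum(sizes)[:-1])

# Hypothetical two-layer, single-input single-output example.
layers = [
    ([0.9, 0.1], [[1.0], [1.0]], [[1.0, 1.0]]),
    ([0.5, 0.5], [[1.0], [0.1]], [[1.0, 1.0]]),
]
masks = global_keep_masks(layers, prune_ratio=0.5)
print([m.tolist() for m in masks])
```

In this toy setup, the slow mode (eigenvalue 0.9) and the strongly driven mode dominate their layers, so a 50% global budget removes the weaker state from each layer. A practical version would likely also guard against pruning every state in a layer, which this sketch does not do.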

Problem

Research questions and friction points this paper is trying to address.

state space models

state pruning

asymptotic impulse-response energy

model compression

compute efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

AIRE-Prune

state space models

asymptotic impulse-response energy