🤖 AI Summary
Current state-space models (e.g., Mamba) make three key parameters input-dependent but keep the forget-gate matrix (A) static. This leaves memory decay inflexible, which suits long-horizon human activity forecasting poorly because memory demands change over time. To address this, the authors propose MixANT, a mixture-of-experts architecture that selects contextually relevant A matrices based on input features, so that temporal memory propagation in the state-space model becomes context-sensitive. According to the abstract, this increases representational capacity without sacrificing computational efficiency. On three benchmark datasets of human activity (50Salads, Breakfast, and Assembly101), MixANT is reported to outperform existing state-of-the-art methods across all evaluation settings, which supports the case for input-dependent forget gates in long-horizon action anticipation.
📝 Abstract
We present MixANT, a novel architecture for stochastic long-term dense anticipation of human activities. While recent State Space Models (SSMs) like Mamba have shown promise through input-dependent selectivity on three key parameters, the critical forget-gate ($\textbf{A}$ matrix) controlling temporal memory remains static. We address this limitation by introducing a mixture of experts approach that dynamically selects contextually relevant $\textbf{A}$ matrices based on input features, enhancing representational capacity without sacrificing computational efficiency. Extensive experiments on the 50Salads, Breakfast, and Assembly101 datasets demonstrate that MixANT consistently outperforms state-of-the-art methods across all evaluation settings. Our results highlight the importance of input-dependent forget-gate mechanisms for reliable prediction of human behavior in diverse real-world scenarios.
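To make the idea of an input-dependent forget gate concrete, the following NumPy sketch runs a diagonal selective SSM recurrence in which the $\textbf{A}$ matrix at each time step is a router-weighted mixture of several expert matrices. It is an illustration under stated assumptions, not the MixANT implementation: the abstract does not specify the routing rule, so the softmax router, the soft (rather than top-1) mixing, the diagonal parameterization, and every name here (`mixture_a_scan`, `W_router`, `W_delta`, `W_B`, `W_C`) are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def softplus(z):
    """Smooth positive map, used here to keep the step size delta > 0."""
    return np.logaddexp(0.0, z)


def mixture_a_scan(x, A_experts, W_router, W_delta, W_B, W_C):
    """Sequential SSM scan with an input-dependent mixture of A matrices.

    x:         (T, D) input sequence
    A_experts: (K, D, N) expert diagonal A matrices (negative entries)
    W_router:  (D, K) hypothetical router projection
    W_delta:   (D, D), W_B: (D, N), W_C: (D, N) hypothetical projections
    Returns outputs (T, D) and router weights (T, K).
    """
    T, D = x.shape
    K, _, N = A_experts.shape
    h = np.zeros((D, N))          # hidden state, one N-dim state per channel
    ys = np.zeros((T, D))
    gates = np.zeros((T, K))
    for t in range(T):
        x_t = x[t]
        # Assumption: soft routing; weights over the K experts sum to 1.
        w = softmax(x_t @ W_router)
        # Input-dependent A: convex combination of the expert matrices.
        A_t = np.tensordot(w, A_experts, axes=1)      # (D, N)
        # Input-dependent step size, input matrix, and output matrix.
        delta = softplus(x_t @ W_delta)               # (D,)
        B_t = x_t @ W_B                               # (N,)
        C_t = x_t @ W_C                               # (N,)
        # Discretize: since A_t < 0 and delta > 0, A_bar lies in (0, 1)
        # and acts as a per-state forget gate.
        A_bar = np.exp(delta[:, None] * A_t)          # (D, N)
        B_bar = delta[:, None] * B_t[None, :]         # (D, N)
        h = A_bar * h + B_bar * x_t[:, None]
        ys[t] = h @ C_t
        gates[t] = w
    return ys, gates


# Toy usage with random parameters (illustrative sizes only).
rng = np.random.default_rng(0)
T, D, N, K = 6, 4, 3, 2
x = rng.normal(size=(T, D))
A_experts = -np.exp(rng.normal(size=(K, D, N)))   # strictly negative
W_router = rng.normal(size=(D, K))
W_delta = rng.normal(size=(D, D))
W_B = rng.normal(size=(D, N))
W_C = rng.normal(size=(D, N))
y, gates = mixture_a_scan(x, A_experts, W_router, W_delta, W_B, W_C)
```

In this sketch the only change relative to a plain selective scan is the line that builds `A_t` from the router weights; with a single expert (`K = 1`) it reduces to a fixed $\textbf{A}$. A hard top-1 selection would be another plausible reading of "selects" in the abstract and could be sketched by replacing the soft mixture with `A_experts[np.argmax(w)]`.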