🤖 AI Summary
This work addresses trajectory prediction in dense crowds by proposing an efficient approach to modeling dynamic social interactions. The method reformulates interaction modeling as a structured sequential process and introduces a Cycle Mamba module that enables continuous bidirectional information flow. By combining egocentric grid encoding, social triplet factorization, a learnable social gate, and a global scan, it adapts selective state space models, which are designed for ordered sequences, to the unstructured nature of social interactions. This adaptation strengthens interaction representation while preserving linear computational complexity. The architecture achieves state-of-the-art accuracy on five benchmark datasets with improved parameter efficiency and computational scalability, and it can be embedded into a flow-matching framework for further gains in accuracy and efficiency.
📝 Abstract
Human trajectory forecasting is crucial for safe navigation in crowded environments, requiring models that balance accuracy with computational efficiency. Efficiently modeling social interactions is key to performance in dense crowds. Yet most recent methods rely on attention mechanisms, which capture complex dependencies effectively but incur computational costs that grow quadratically with the number of neighbors. Recently, Selective State-Space Models have provided a linear-time alternative; however, their inherently sequential design is misaligned with the unstructured and dynamic nature of social interactions. To address this challenge, we propose Social-Mamba, a forecasting architecture that reformulates social interactions as structured sequential processes. At its core is the Cycle Mamba block, a novel module that enables continuous bidirectional information flow. Social-Mamba organizes agents on an egocentric grid and introduces social triplet factorization, which decomposes interactions into temporal, egocentric, and goal-centric scans. These scans are dynamically integrated through a learnable social gate and a global scan to generate accurate and efficient trajectory predictions. Extensive experiments on five trajectory forecasting benchmarks show that Social-Mamba achieves state-of-the-art accuracy while offering superior parameter efficiency and computational scalability. Embedding Social-Mamba into a flow-matching framework further improves both accuracy and efficiency, establishing it as a flexible and robust foundation for future trajectory forecasting research. The code is publicly available: https://github.com/vita-epfl/Social-Mamba
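To make two of the ideas above more concrete, the following is a minimal NumPy sketch, not taken from the Social-Mamba repository. It illustrates (1) one possible way to turn an unordered set of neighbors into an ordered sequence by binning them on a grid centred on the ego agent, and (2) one possible way to fuse three scan outputs with gate weights. The function names `egocentric_grid_order` and `gated_fusion`, the parameters `cell_size` and `grid_radius`, the row-major cell ordering, and the softmax form of the gate are all illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def egocentric_grid_order(ego_xy, neighbors_xy, cell_size=1.0, grid_radius=4):
    """Order neighbors by their cell index on a grid centred on the ego agent.

    Hypothetical helper: cell_size, grid_radius, and the row-major ordering
    are assumptions made for illustration only.
    Returns (order, flat_cells): indices of in-grid neighbors sorted by cell,
    and the corresponding flat cell indices.
    """
    rel = np.asarray(neighbors_xy, dtype=float) - np.asarray(ego_xy, dtype=float)
    # Integer cell coordinates, shifted so the ego cell sits at the grid centre.
    cells = np.floor(rel / cell_size).astype(int) + grid_radius
    side = 2 * grid_radius + 1
    # Discard neighbors that fall outside the grid.
    inside = np.all((cells >= 0) & (cells < side), axis=1)
    # Row-major flattening: y * side + x.
    flat = cells[:, 1] * side + cells[:, 0]
    idx = np.nonzero(inside)[0]
    order = idx[np.argsort(flat[idx], kind="stable")]
    return order, flat[order]


def gated_fusion(scan_features, gate_logits):
    """Blend scan outputs with softmax gate weights (assumed gate form).

    scan_features: array of shape (num_scans, d), one row per scan
                   (for example temporal, egocentric, goal-centric).
    gate_logits:   array of shape (num_scans,); in a trained model these
                   would be produced by learnable parameters.
    """
    scan_features = np.asarray(scan_features, dtype=float)
    gate_logits = np.asarray(gate_logits, dtype=float)
    weights = np.exp(gate_logits - np.max(gate_logits))
    weights = weights / weights.sum()
    return weights @ scan_features, weights


# Example: four neighbors around an ego agent at the origin.
order, flat_cells = egocentric_grid_order(
    ego_xy=(0.0, 0.0),
    neighbors_xy=[(1.5, 0.2), (-0.5, -0.5), (10.0, 10.0), (0.2, 1.1)],
    cell_size=1.0,
    grid_radius=2,
)

# Example: three scan outputs fused with equal gate logits.
fused, weights = gated_fusion(
    scan_features=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    gate_logits=[0.0, 0.0, 0.0],
)
```

In this sketch the ordered index list is what a sequential model would consume in place of the raw, unordered neighbor set. The ordering step is a single sort over the neighbors, so it avoids the pairwise comparisons that make attention quadratic in the number of neighbors.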