SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports

📅 2025-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In team sports multi-object tracking (MOT), high-speed motion, frequent occlusions, and highly non-linear trajectories cause severe identity switches and localization errors. Existing approaches rely primarily on detection outputs and appearance matching, and they struggle when appearances are ambiguous and motion is non-linear. To address these challenges, we propose SportMamba, a Mamba-attention hybrid architecture with three components: (1) a state-space model (Mamba) combined with attention captures long-range, non-linear motion dependencies; (2) a height-adaptive spatial association metric accounts for scale changes due to depth and reduces ID switches under partial occlusion; and (3) adaptive buffers extend the detection search space to improve association under fast motion. Our method achieves state-of-the-art performance on SportsMOT. In zero-shot transfer to the VIP-HTD ice hockey dataset, it generalizes well despite domain shift in camera setup, player appearance, and motion patterns, which supports its applicability to real-world sports analytics.

📝 Abstract

Multi-object tracking (MOT) in team sports is particularly challenging because fast-paced motion causes motion blur and frequent occlusions cause identity switches. Predicting player positions in such scenarios is difficult because the observed motion is highly non-linear. Current methods rely heavily on object detection and appearance-based tracking, which struggle in complex team sports scenarios where appearance cues are ambiguous and motion is not necessarily linear. To address these challenges, we introduce SportMamba, an adaptive hybrid MOT technique designed for tracking in dynamic team sports. The technical contribution of SportMamba is twofold. First, we introduce a Mamba-attention mechanism that models non-linear motion by implicitly focusing on relevant embedding dependencies. Second, we propose a height-adaptive spatial association metric that accounts for scale variations due to depth changes, reducing ID switches caused by partial occlusions. Additionally, we extend the detection search space with adaptive buffers to improve associations in fast-motion scenarios. SportMamba demonstrates state-of-the-art performance on various metrics on the SportsMOT dataset, which is characterized by complex motion and severe occlusion. Furthermore, we demonstrate its generalization capability through zero-shot transfer to VIP-HTD, an ice hockey dataset.

Problem

Research questions and friction points this paper is trying to address.

Tracking players in fast-paced team sports with frequent occlusions

Modeling non-linear motion patterns in dynamic sports scenarios

Reducing identity switches caused by partial occlusions and fast motion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba-attention mechanism models non-linear motion

Height-adaptive metric reduces occlusion ID switches

Adaptive buffers extend detection search space
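
To make the last two ideas concrete, here is a minimal sketch of how a buffered, height-aware association score could be computed. It is not the paper's formulation, which this summary does not give. It assumes a fixed buffer ratio (the paper's buffers are adaptive) and a simple product of a vertical-extent overlap with a buffered IoU. The names `expand`, `height_iou`, `association_score`, and the `buffer` parameter are all hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Plain intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def expand(box: Box, buffer: float) -> Box:
    """Grow a box on every side by `buffer` times its width or height."""
    w, h = box[2] - box[0], box[3] - box[1]
    return (box[0] - buffer * w, box[1] - buffer * h,
            box[2] + buffer * w, box[3] + buffer * h)


def height_iou(a: Box, b: Box) -> float:
    """Overlap of the vertical extents divided by their combined span."""
    inter = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    span = max(a[3], b[3]) - min(a[1], b[1])
    return inter / span if span > 0 else 0.0


def association_score(track: Box, det: Box, buffer: float = 0.3) -> float:
    """Assumed combination: buffered IoU weighted by height agreement."""
    return height_iou(track, det) * iou(expand(track, buffer), expand(det, buffer))


# A fast-moving player: the last track box and the new detection do not overlap.
track = (0.0, 0.0, 10.0, 20.0)
det = (12.0, 0.0, 22.0, 20.0)
plain = iou(track, det)                   # no overlap without buffering
buffered = association_score(track, det)  # buffering recovers a non-zero score
```

In this sketch the buffer lets a detection that has jumped past the previous box still be matched, while the height term lowers the score for candidates whose vertical extent differs, as happens when a player is partially occluded or stands at a different depth.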

🔎 Similar Papers

Exploring Learning-based Motion Models in Multi-Object Tracking