🤖 AI Summary
This work addresses the challenge of accurately tracking fast-moving, small objects such as racquetballs, which combine irregular bouncing dynamics with weak visual features. We systematically evaluate five Kalman filter-based multi-object trackers (OCSORT, DeepOCSORT, ByteTrack, BoTSORT, and StrongSORT) on a custom dataset of 10,000 annotated frames captured at 720p-1280p resolution across four scenarios. Results reveal pervasive spatial drift (3–11 cm, corresponding to ADE values of 31–114 pixels) and localization errors 3–4× higher than on conventional benchmarks. The analysis centers on how inference speed and per-image update frequency affect tracking accuracy and reliability. DeepOCSORT achieves the highest accuracy (ADE: 31.15 pixels, versus 114.3 pixels for ByteTrack), while ByteTrack attains the fastest inference (26.6 ms, versus 26.8 ms for DeepOCSORT). To our knowledge, this is the first study to quantitatively characterize the performance bottlenecks of high-speed, tiny-object tracking. Our findings underscore the need for tracker designs tailored to such extreme motion and appearance constraints.
📝 Abstract
Unpredictable movement patterns and a small visual footprint make precise tracking of fast-moving tiny objects, such as a racquetball, a challenging problem in computer vision. The challenge is particularly relevant to sport robotics, where lightweight and accurate tracking systems can improve robot perception and planning. Kalman filter-based tracking methods have been successful in general object tracking, but their performance degrades substantially on rapidly moving objects that bounce irregularly. In this study, we evaluate five state-of-the-art Kalman filter-based trackers (OCSORT, DeepOCSORT, ByteTrack, BoTSORT, and StrongSORT) on a custom dataset of 10,000 annotated racquetball frames captured at 720p-1280p resolution. We focus on two performance factors, inference speed and update frequency per image, and examine how they affect tracking accuracy and reliability for fast-moving tiny objects. Across four distinct scenarios, DeepOCSORT achieves the lowest tracking error, with an average ADE of 31.15 pixels compared with ByteTrack's 114.3 pixels, while ByteTrack is the fastest, with an average inference time of 26.6 ms versus DeepOCSORT's 26.8 ms. However, all of the Kalman filter-based trackers exhibit significant drift, with spatial errors of 3-11 cm (ADE values of 31-114 pixels), which indicates fundamental limitations in handling the unpredictable motion of fast-moving tiny objects like racquetballs. These error rates are 3-4x higher than those reported on standard object tracking benchmarks, so current tracking approaches need substantial improvement, and fast-moving tiny object tracking calls for specialized methodologies.
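To make the reported metric concrete, the following is a minimal sketch of how an ADE-style error could be computed. It assumes that ADE stands for average displacement error, taken as the mean Euclidean distance between predicted and annotated ball centers over matched frames; the abstract does not spell out the definition, so this is an illustration rather than the authors' evaluation code. The function names, the toy coordinates, and the `cm_per_pixel` scale are all hypothetical. The value 0.1 is chosen only because the abstract pairs 31-114 pixels with 3-11 cm, which suggests a scale of roughly 0.1 cm per pixel.

```python
import math


def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean distance, in pixels, between matched per-frame centers.

    Both arguments are sequences of (x, y) pixel coordinates, one per frame.
    This definition of ADE is an assumption; the abstract does not give one.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = 0.0
    for (px, py), (gx, gy) in zip(predicted, ground_truth):
        total += math.hypot(px - gx, py - gy)
    return total / len(predicted)


def pixels_to_cm(error_px, cm_per_pixel):
    """Convert a pixel error to centimeters with a fixed (hypothetical) scale."""
    return error_px * cm_per_pixel


# Toy trajectories, invented for illustration only (not data from the study).
ground_truth = [(100.0, 200.0), (130.0, 180.0), (160.0, 200.0)]
predicted = [(106.0, 208.0), (130.0, 190.0), (166.0, 208.0)]

ade_px = average_displacement_error(predicted, ground_truth)
# Hypothetical scale: roughly what the abstract's px/cm pairing implies.
ade_cm = pixels_to_cm(ade_px, cm_per_pixel=0.1)
print(f"ADE: {ade_px:.2f} px (~{ade_cm:.2f} cm)")
```

A single fixed scale is a simplification: in real footage the pixel-to-centimeter ratio depends on camera calibration and on the ball's distance from the camera, so any centimeter figure derived this way is only approximate.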