🤖 AI Summary
To address the challenges of identifying motion direction and ensuring robust tracking for tiny targets (only a few dozen pixels in size) at low sampling frequencies (60–240 Hz), this paper proposes STMDNet, a lightweight, model-based, direction-aware network. Its key contributions are: (1) an ipsilateral (same-side) excitation and leakage-enhancing contralateral inhibition mechanism that improves directional discrimination under weak visual cues while suppressing large-object and background motion; (2) a directional encoding–decoding strategy that needs only one correlation per spatial location, reducing computational cost to one-eighth of that of prior methods; and (3) a dual-path neural dynamics modeling framework coupled with model-driven inference. Substituting STMDNet for the backbone of a strong STMD model yields STMDNet-F and raises AUC by 24%. On real-world low-sampling-frequency data, STMDNet-F improves mean F1-score (mF1) by 8–19% and surpasses the deep learning baseline. STMDNet itself runs at 87 FPS on a single CPU thread.
📝 Abstract
Recognizing the motion of tiny targets (only a few dozen pixels in size) against cluttered backgrounds remains a fundamental challenge, because standard feature-based and deep learning methods fail when visual cues are scarce. We propose STMDNet, a model-based computational framework that recognizes the motion of tiny targets at variable velocities under low sampling frequencies. STMDNet introduces a novel dual-dynamics-and-correlation mechanism that harnesses ipsilateral excitation to integrate target cues and leakage-enhancing contralateral inhibition to suppress interference from large-object and background motion. Moreover, we develop the first collaborative directional encoding-decoding strategy that determines the motion direction from only one correlation per spatial location, cutting computational cost to one-eighth of that of prior methods. Further, simply substituting the backbone of a strong STMD model with STMDNet raises AUC by 24%, yielding an enhanced model, STMDNet-F. Evaluations on real-world low-sampling-frequency datasets show state-of-the-art results, surpassing the deep learning baseline. Across diverse speeds, STMDNet-F improves mF1 by 19%, 16%, and 8% at 240 Hz, 120 Hz, and 60 Hz, respectively, while STMDNet achieves 87 FPS on a single CPU thread. These advances highlight STMDNet as a next-generation backbone for tiny-target motion pattern recognition and underscore its broader potential to revitalize model-based visual approaches in motion detection.
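The one-eighth cost figure can be illustrated with a minimal sketch. It is not the STMDNet implementation. It assumes that the prior methods evaluate eight preferred directions with one delay-and-correlate operation each, which the abstract does not state explicitly. The names `OFFSETS` and `directional_correlations` are hypothetical, and the encoding-decoding scheme itself is not reproduced here; only the conventional per-direction baseline and the resulting cost ratio are shown.

```python
# Hypothetical set of eight preferred directions, as (dy, dx) unit offsets.
# The abstract does not name the directions; eight is an assumption that
# matches the stated one-eighth cost ratio.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def directional_correlations(prev_frame, curr_frame, y, x):
    """Baseline: one delay-and-correlate response per direction at pixel (y, x)."""
    h, w = len(prev_frame), len(prev_frame[0])
    responses = []
    for dy, dx in OFFSETS:
        # A target moving along (dy, dx) was at (y - dy, x - dx) one frame ago.
        sy, sx = y - dy, x - dx
        inside = 0 <= sy < h and 0 <= sx < w
        delayed = prev_frame[sy][sx] if inside else 0.0
        responses.append(curr_frame[y][x] * delayed)
    return responses


# Toy input: a one-pixel target moves one pixel to the right between frames.
prev = [[0.0] * 5 for _ in range(5)]
curr = [[0.0] * 5 for _ in range(5)]
prev[2][2] = 1.0
curr[2][3] = 1.0

resp = directional_correlations(prev, curr, 2, 3)
best_direction = OFFSETS[resp.index(max(resp))]

baseline_correlations = len(OFFSETS)  # eight correlations per location
single_correlations = 1               # one correlation per location, as claimed
cost_ratio = single_correlations / baseline_correlations

print(best_direction, cost_ratio)
```

Under these assumptions the baseline spends eight multiplications per pixel to recover the direction, so a scheme that needs a single correlation per location reduces the correlation count to 1/8, in line with the figure quoted in the abstract.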