🤖 AI Summary
Deep point trackers run GPU inference on every frame, which makes them too costly for real-time use on edge devices. This paper proposes K-Track, a tracker-agnostic hybrid acceleration framework that combines sparse deep-learning keyframe updates with lightweight Kalman filtering. Kalman prediction replaces deep inference on most frames, and Bayesian uncertainty propagation maintains temporal coherence between keyframes, reducing computation while preserving trajectory robustness. Evaluated across multiple state-of-the-art point trackers, the framework achieves a 5–10× speedup while retaining over 85% of the original trackers' accuracy, with real-time performance reported on edge platforms such as the NVIDIA Jetson Nano. Its core contribution is decoupling deep inference from temporal modeling, offering a scalable, low-latency approach to point tracking on resource-constrained hardware.
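A rough cost model helps show where a speedup of this size could come from. It is a back-of-the-envelope sketch, not a formula from the paper: the symbols $c_d$ (cost of one deep-tracker inference), $c_k$ (cost of one Kalman step), and $N$ (keyframe interval) are all assumptions introduced here, and the model assumes one deep inference every $N$ frames plus one Kalman step on every frame.

```latex
\bar{c} = \frac{c_d}{N} + c_k,
\qquad
S = \frac{c_d}{\bar{c}} = \frac{N}{1 + N\,c_k / c_d}
```

When $c_k \ll c_d$, the speedup $S$ approaches $N$. Under this simplified model, a 5–10× speedup would correspond to a keyframe interval of roughly that size or slightly larger. The summary and abstract do not state the intervals actually used.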
📝 Abstract
Point tracking in video sequences is a foundational capability for real-world computer vision applications, including robotics, autonomous systems, augmented reality, and video analysis. While recent deep learning-based trackers achieve state-of-the-art accuracy on challenging benchmarks, their reliance on per-frame GPU inference poses a major barrier to deployment on resource-constrained edge devices, where compute, power, and connectivity are limited. We introduce K-Track (Kalman-enhanced Tracking), a general-purpose, tracker-agnostic acceleration framework designed to bridge this deployment gap. K-Track reduces inference cost by combining sparse deep learning keyframe updates with lightweight Kalman filtering for intermediate frame prediction, using principled Bayesian uncertainty propagation to maintain temporal coherence. This hybrid strategy enables a 5–10× speedup while retaining over 85% of the original trackers' accuracy. We evaluate K-Track across multiple state-of-the-art point trackers and demonstrate real-time performance on edge platforms such as the NVIDIA Jetson Nano and RTX Titan. By preserving accuracy while dramatically lowering computational requirements, K-Track provides a practical path toward deploying high-quality point tracking in real-world, resource-limited settings, closing the gap between modern tracking algorithms and deployable vision systems.
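The hybrid strategy described in the abstract can be illustrated with a minimal sketch. This is not the K-Track implementation: the constant-velocity motion model, the noise parameters, the fixed keyframe interval, and the names `PointKalman`, `track`, and `oracle_tracker` are all assumptions made for illustration. The deep tracker is represented by a hypothetical callable that is invoked only on keyframes. On every other frame the Kalman prediction stands in for it, and the covariance `P` carries the uncertainty forward between keyframes.

```python
import numpy as np


class PointKalman:
    """Constant-velocity Kalman filter for n independent 2D points (illustrative)."""

    def __init__(self, points, pos_var=1.0, vel_var=10.0, q=0.1, r=1.0):
        points = np.asarray(points, dtype=float)
        n = len(points)
        # State per point: [x, y, vx, vy]; velocities start at zero.
        self.x = np.zeros((n, 4))
        self.x[:, :2] = points
        # One 4x4 covariance per point.
        self.P = np.tile(np.diag([pos_var, pos_var, vel_var, vel_var]), (n, 1, 1))
        self.F = np.array([[1, 0, 1, 0],
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)  # one-frame transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)  # positions are observed
        self.Q = q * np.eye(4)  # process noise (assumed value)
        self.R = r * np.eye(2)  # measurement noise (assumed value)

    def predict(self):
        """Propagate state and uncertainty by one frame."""
        self.x = self.x @ self.F.T
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:, :2].copy()

    def update(self, z):
        """Correct the state with measured positions z of shape (n, 2)."""
        innovation = z - self.x @ self.H.T
        S = self.H @ self.P @ self.H.T + self.R       # (n, 2, 2)
        K = self.P @ self.H.T @ np.linalg.inv(S)      # (n, 4, 2)
        self.x = self.x + np.einsum("nij,nj->ni", K, innovation)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:, :2].copy()


def track(frames, init_points, deep_tracker, keyframe_interval=5):
    """Call deep_tracker only on keyframes; use Kalman prediction elsewhere."""
    kf = PointKalman(init_points)
    trajectory = [np.asarray(init_points, dtype=float)]
    deep_calls = 0
    for t in range(1, len(frames)):
        points = kf.predict()
        if t % keyframe_interval == 0:
            # Hypothetical interface: (frame, predicted points) -> measured points.
            measured = deep_tracker(frames[t], points)
            deep_calls += 1
            points = kf.update(measured)
        trajectory.append(points)
    return np.stack(trajectory), deep_calls


# Synthetic demo: three points moving at constant velocity. The stand-in
# "deep tracker" simply returns the ground truth for the requested frame.
rng = np.random.default_rng(0)
p0 = rng.uniform(0, 100, size=(3, 2))
velocity = np.array([2.0, -1.0])
gt = np.stack([p0 + velocity * t for t in range(21)])
frames = list(range(21))  # frame indices stand in for images


def oracle_tracker(frame, predicted_points):
    return gt[frame]


traj, calls = track(frames, p0, oracle_tracker, keyframe_interval=5)
```

In this toy setup the stand-in tracker runs on only 4 of the 20 tracked frames. A real deployment would substitute an actual point tracker for `oracle_tracker` and would likely need tuned noise parameters and a keyframe schedule suited to the motion in the scene.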