🤖 AI Summary
To address the trade-off between efficiency and robustness in LiDAR-based 3D single-object tracking, this paper introduces a trajectory-based tracking paradigm and its instantiation, TrajTrack. Rather than processing additional point clouds from past frames, it implicitly models motion continuity from historical bounding box trajectories alone, which improves robustness under sparsity and occlusion. The method uses a lightweight two-stage design: a base two-frame tracker first produces a fast, explicit motion proposal, and an implicit motion modeling module then predicts the future trajectory, which is used to refine and correct that proposal. The paradigm is reported to generalize across different base trackers. On the nuScenes benchmark, it improves tracking precision by 4.48% over a strong baseline while running at 56 FPS.
📝 Abstract
LiDAR-based 3D single object tracking (3D SOT) is a critical task in robotics and autonomous systems. Existing methods typically follow either a frame-wise (two-frame) motion estimation paradigm or a sequence-based paradigm. Two-frame methods are efficient but lack long-term temporal context, which makes them vulnerable in sparse or occluded scenes; sequence-based methods that process multiple point clouds gain robustness, but at a significant computational cost. To resolve this dilemma, we propose a novel trajectory-based paradigm and its instantiation, TrajTrack. TrajTrack is a lightweight framework that enhances a base two-frame tracker by implicitly learning motion continuity from historical bounding box trajectories alone, without requiring additional, costly point cloud inputs. It first generates a fast, explicit motion proposal and then uses an implicit motion modeling module to predict the future trajectory, which in turn refines and corrects the initial proposal. Extensive experiments on the large-scale nuScenes benchmark show that TrajTrack achieves new state-of-the-art performance, improving tracking precision by 4.48% over a strong baseline while running at 56 FPS. We also demonstrate the strong generalizability of TrajTrack across different base trackers. Video is available at https://www.bilibili.com/video/BV1ahYgzmEWP.
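The propose-then-refine flow described in the abstract can be illustrated with a minimal sketch. This is not TrajTrack's implementation: the learned implicit motion module is replaced here by a plain constant-velocity extrapolation over past box centers, the refinement is a fixed-weight blend, and all names (`predict_from_trajectory`, `refine`, `track_step`, `weight`) are hypothetical. It only shows how a trajectory prior built from past boxes, with no extra point clouds, might correct a two-frame proposal.

```python
from typing import List, Sequence, Tuple

# Box center (x, y, z) in metres. A real tracker also carries size and heading;
# they are omitted here to keep the sketch short.
Center = Tuple[float, float, float]


def predict_from_trajectory(history: Sequence[Center]) -> Center:
    """Stand-in for the implicit motion module (assumption, not the paper's model).

    Extrapolates the next center from the last two historical centers under a
    constant-velocity assumption. Only past boxes are used, no point clouds.
    """
    if not history:
        raise ValueError("history must contain at least one box center")
    if len(history) == 1:
        return history[-1]  # no velocity estimate yet: assume a static target
    prev, last = history[-2], history[-1]
    return tuple(l + (l - p) for p, l in zip(prev, last))


def refine(proposal: Center, predicted: Center, weight: float = 0.5) -> Center:
    """Blend the explicit proposal with the trajectory prediction.

    `weight` is a hypothetical fixed trust in the trajectory prior; a learned
    system would typically produce this correction itself.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return tuple((1.0 - weight) * a + weight * b for a, b in zip(proposal, predicted))


def track_step(history: List[Center], proposal: Center, weight: float = 0.5) -> Center:
    """One tracking step: predict from history, refine the proposal, store the result."""
    predicted = predict_from_trajectory(history)
    refined = refine(proposal, predicted, weight)
    history.append(refined)
    return refined


if __name__ == "__main__":
    past = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # target moving along +x
    noisy_proposal = (2.4, 0.2, 0.0)            # e.g. a two-frame estimate degraded by sparsity
    print(track_step(past, noisy_proposal))
```

In this sketch the per-frame cost of the trajectory branch is a handful of arithmetic operations on box coordinates, which reflects the abstract's argument that reasoning over boxes is far cheaper than processing additional point clouds.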