CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking

📅 2025-11-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the dual challenges of spatial redundancy (background noise interference) and informational redundancy (inefficient foreground representation) in LiDAR point cloud-based 3D single-object tracking, this paper proposes an end-to-end dynamic token compression framework. Grounded in the information bottleneck principle, the framework combines information-entropy-driven spatial foreground prediction with online singular value decomposition (SVD) and low-rank feature approximation, suppressing background adaptively and refining the foreground into a structured form. It then generates lightweight proxy tokens that form a compact, discriminative representation. Evaluated on KITTI, nuScenes, and the Waymo Open Dataset, the method achieves state-of-the-art (SOTA) performance while running in real time at 90 FPS on a single RTX 3090 GPU, a favorable trade-off between accuracy and computational efficiency.

📝 Abstract
3D single object tracking (SOT) in LiDAR point clouds is a critical task in computer vision and autonomous driving. Despite significant progress, the inherent sparsity of point clouds introduces a dual-redundancy challenge that limits existing trackers: (1) vast spatial redundancy from background noise impairs accuracy, and (2) informational redundancy within the foreground hinders efficiency. To tackle these issues, we propose CompTrack, a novel end-to-end framework that systematically eliminates both forms of redundancy in point clouds. First, CompTrack incorporates a Spatial Foreground Predictor (SFP) module that filters out irrelevant background noise based on information entropy, addressing spatial redundancy. Second, at its core is an Information Bottleneck-guided Dynamic Token Compression (IB-DTC) module that eliminates the informational redundancy within the foreground. Theoretically grounded in low-rank approximation, this module uses an online SVD analysis to adaptively compress the redundant foreground into a compact and highly informative set of proxy tokens. Extensive experiments on the KITTI, nuScenes, and Waymo datasets demonstrate that CompTrack achieves top tracking performance with superior efficiency, running in real time at 90 FPS on a single RTX 3090 GPU.
Problem

Research questions and friction points this paper is trying to address.

Eliminates spatial redundancy from background noise in point clouds
Reduces informational redundancy within foreground objects for efficiency
Dynamically compresses foreground tokens via low-rank approximation for real-time tracking
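
The first item, removing background tokens using information entropy, can be illustrated with a small NumPy sketch. The source does not say how entropy enters the Spatial Foreground Predictor, so everything here is an assumption made for illustration: the function names, the per-token foreground probability `fg_prob`, and both thresholds are hypothetical. The sketch keeps a token only when it is predicted as foreground and the prediction has low binary entropy, i.e. is confident.

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    """Shannon entropy (in bits) of a Bernoulli variable with probability p."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log2(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def select_foreground(tokens, fg_prob, prob_thresh=0.5, max_entropy=0.9):
    """Keep tokens that are likely foreground AND confidently predicted.

    tokens:  (N, C) array of token features.
    fg_prob: (N,) array of foreground probabilities (hypothetical predictor output).
    Both thresholds are illustrative values, not taken from the paper.
    """
    entropy = binary_entropy(fg_prob)
    keep = (fg_prob > prob_thresh) & (entropy < max_entropy)
    return tokens[keep], keep

# Usage: four tokens with two feature channels each.
tokens = np.arange(8, dtype=float).reshape(4, 2)
fg_prob = np.array([0.99, 0.55, 0.05, 0.90])
kept, mask = select_foreground(tokens, fg_prob)
```

In this example the second token is dropped even though its probability is above 0.5, because a prediction of 0.55 is close to maximal uncertainty.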
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatial Foreground Predictor filters background noise
Information Bottleneck-guided Dynamic Token Compression
Online SVD adaptively compresses redundant foreground tokens
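
The last item, compressing redundant foreground tokens with an SVD, can be sketched in NumPy as follows. This is a minimal illustration of SVD-based low-rank compression with a rank chosen from the singular-value spectrum, not the paper's IB-DTC module: the function name, the `energy` threshold, and the choice of scaled right singular vectors as "proxy tokens" are all assumptions made for the example.

```python
import numpy as np

def compress_tokens(tokens, energy=0.95):
    """Compress an (N, C) token matrix into k proxy tokens of shape (k, C).

    k is chosen per input as the smallest rank whose singular values hold
    at least `energy` of the total squared spectrum (hypothetical criterion).
    """
    # Thin SVD: tokens = U @ diag(S) @ Vt, with S sorted in descending order.
    U, S, Vt = np.linalg.svd(tokens, full_matrices=False)
    total = np.sum(S ** 2)
    if total == 0.0:
        # Degenerate all-zero input: return a single zero proxy token.
        return np.zeros((1, tokens.shape[1])), 1
    cumulative = np.cumsum(S ** 2) / total
    # First index where the cumulative energy reaches the threshold.
    k = int(np.searchsorted(cumulative, energy)) + 1
    k = min(k, len(S))
    # Proxy tokens: top-k right singular vectors scaled by their singular values.
    proxies = S[:k, None] * Vt[:k]
    return proxies, k

# Usage: 128 tokens that actually lie in a 3-dimensional subspace.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(128, 3)) @ rng.normal(size=(3, 32))
proxies, k = compress_tokens(tokens, energy=0.999)
```

Because the rank is read from each input's spectrum, a highly redundant token set collapses to a few proxies while a more varied one keeps more. With this construction, `proxies.T @ proxies` equals the rank-k approximation of `tokens.T @ tokens`, so the channel-wise second-order statistics of the tokens are approximately preserved.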
Sifan Zhou
Southeast University
Robotics, M/LLMs, Spatial AI, Quantization
Yichao Cao
Central South University
Jiahao Nie
Zhejiang University of Finance and Economics
Yuqian Fu
INSAIT, Sofia University "St. Kliment Ohridski"
Ziyu Zhao
University of South Carolina
Computer vision, 2D/3D segmentation, Generative 3D reconstruction
Xiaobo Lu
School of Automation, Southeast University; Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, China
Shuo Wang
Mininglamp Technology