4D-CS: Exploiting Cluster Prior for 4D Spatio-Temporal LiDAR Semantic Segmentation

📅 2025-01-01
🏛️ IEEE Robotics and Automation Letters
🤖 AI Summary
To address spatial and temporal inconsistency in 4D semantic segmentation of multi-frame LiDAR point clouds, this paper proposes 4D-CS, a dual-branch network guided by cluster priors. It generates cluster labels for foreground objects across multiple frames; adaptively fuses point-level and cluster-level features; and merges neighboring clusters across frames to restore features lost to occlusion. The paper reports state-of-the-art results on multi-scan semantic segmentation and moving object segmentation on SemanticKITTI and nuScenes. Its core contribution is combining explicit cluster-prior guidance with adaptive point-cluster fusion in a 4D segmentation framework to improve spatio-temporal consistency.

📝 Abstract
Semantic segmentation of LiDAR points has significant value for autonomous driving and mobile robot systems. Most approaches exploit the spatio-temporal information in multiple scans to identify the semantic class and motion state of each point. However, these methods often overlook segmentation consistency in space and time, which may result in points within the same object being predicted as different categories. To handle this issue, our core idea is to generate cluster labels across multiple frames that reflect the complete spatial structure and temporal information of objects. These labels serve as explicit guidance for our dual-branch network, 4D-CS, which integrates point-based and cluster-based branches to enable more consistent segmentation. Specifically, in the point-based branch, we leverage historical knowledge to enrich the current features through temporal fusion on multiple views. In the cluster-based branch, we propose a new strategy to produce cluster labels for foreground objects and use them to gather point-wise information into cluster features. We then merge neighboring clusters across multiple scans to restore features missing due to occlusion. Finally, in the point-cluster fusion stage, we adaptively fuse the information from the two branches to optimize the segmentation results. Extensive experiments confirm the effectiveness of the proposed method, and we achieve state-of-the-art results on multi-scan semantic segmentation and moving object segmentation on the SemanticKITTI and nuScenes datasets.
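The abstract describes gathering point-wise information into cluster features and then adaptively fusing the two branches, without giving the exact operators. The sketch below illustrates one plausible reading in NumPy: average pooling per cluster, followed by a sigmoid gate that blends each foreground point's feature with its cluster feature. The function names, the use of mean pooling, the gate form, and the parameters `w` and `b` are all assumptions made for illustration; they are not taken from the paper, and the cross-scan cluster merging step is not shown.

```python
import numpy as np

def cluster_pool(point_feats, cluster_ids, num_clusters):
    """Average point features per cluster (assumed pooling; id -1 marks background)."""
    sums = np.zeros((num_clusters, point_feats.shape[1]))
    counts = np.zeros(num_clusters)
    fg = cluster_ids >= 0
    # Unbuffered accumulation so repeated cluster ids are all summed.
    np.add.at(sums, cluster_ids[fg], point_feats[fg])
    np.add.at(counts, cluster_ids[fg], 1.0)
    # Empty clusters keep a zero feature instead of dividing by zero.
    return sums / np.maximum(counts, 1.0)[:, None]

def adaptive_fuse(point_feats, cluster_feats, cluster_ids, w, b):
    """Blend point and cluster features with a per-point sigmoid gate (hypothetical)."""
    fused = point_feats.copy()
    fg = cluster_ids >= 0
    gathered = cluster_feats[cluster_ids[fg]]
    logits = np.concatenate([point_feats[fg], gathered], axis=1) @ w + b
    gate = 1.0 / (1.0 + np.exp(-logits))
    fused[fg] = gate[:, None] * point_feats[fg] + (1.0 - gate[:, None]) * gathered
    return fused  # background points are returned unchanged

# Toy input: four points with 2-D features; the last point is background.
point_feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
cluster_ids = np.array([0, 0, 1, -1])

cluster_feats = cluster_pool(point_feats, cluster_ids, num_clusters=2)
# Zero weights and bias give a gate of 0.5, i.e. an equal blend.
fused = adaptive_fuse(point_feats, cluster_feats, cluster_ids, w=np.zeros(4), b=0.0)
```

In a trained network the gate parameters would be learned, so points whose own features are unreliable (for example, on a partially occluded object) could lean more heavily on the cluster feature.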
Problem

Research questions and friction points this paper is trying to address.

LiDAR Semantic Segmentation
Object Recognition
Autonomous Driving
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D-CS method
Dual-branch network
Historical data integration
Jiexi Zhong
Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110819, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Northeastern University, Shenyang 110819, China; Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110819, China
Zhiheng Li
Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110819, China
Yubo Cui
Northeastern University
3D computer vision, object tracking, robot
Zheng Fang
Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110819, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Northeastern University, Shenyang 110819, China; Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110819, China