Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the difficulty of detecting distant and occluded objects in sparse outdoor LiDAR point clouds, this paper proposes an efficient multi-sweep LiDAR perception framework. The method integrates multi-sweep point cloud accumulation, BEV feature projection, and a Transformer-based architecture. Key contributions include: (1) a lightweight, end-to-end learnable Gumbel Spatial Pruning (GSP) layer that dynamically removes redundant points and, because it is decoupled from the rest of the network, can be plugged into existing point cloud architectures; (2) an increase in the number of accumulated LiDAR sweeps from the common 10 to as many as 40 without additional computational overhead; and (3) joint optimization of temporal aggregation and spatial reasoning. Evaluated on nuScenes, the framework achieves +3.2% mAP in 3D object detection and +2.8% mIoU in BEV map segmentation over the TransL baseline, while maintaining identical inference speed.

📝 Abstract
This paper studies point cloud perception within outdoor environments. Existing methods face limitations in recognizing objects located at a distance or occluded, due to the sparse nature of outdoor point clouds. In this work, we observe a significant mitigation of this problem by accumulating multiple temporally consecutive LiDAR sweeps, resulting in a remarkable improvement in perception accuracy. However, the computation cost also increases, hindering previous approaches from utilizing a large number of LiDAR sweeps. To tackle this challenge, we find that a considerable portion of points in the accumulated point cloud is redundant, and discarding these points has minimal impact on perception accuracy. We introduce a simple yet effective Gumbel Spatial Pruning (GSP) layer that dynamically prunes points based on a learned end-to-end sampling. The GSP layer is decoupled from other network components and thus can be seamlessly integrated into existing point cloud network architectures. Without incurring additional computational overhead, we increase the number of LiDAR sweeps from 10, a common practice, to as many as 40. Consequently, there is a significant enhancement in perception performance. For instance, in nuScenes 3D object detection and BEV map segmentation tasks, our pruning strategy improves the vanilla TransL baseline and other baseline methods.
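
The abstract describes the GSP layer only at a high level, so the following is a minimal sketch of one way a Gumbel-based keep/drop decision per point could look, not the authors' implementation. It assumes each point already has a scalar "keep" logit (in a real network this would come from a small learned scoring head), treats keep/drop as a two-class Gumbel-Softmax, and uses only the Python standard library. The names `sample_gumbel`, `gumbel_keep_probs`, `prune_points`, and the temperature `tau` are hypothetical.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gumbel_keep_probs(keep_logits, rng, tau=1.0):
    """Relaxed keep probability for every point.

    Assumption: each point has two classes, keep (logit = keep_logit) and
    drop (logit = 0). Perturbing both with independent Gumbel noise and
    taking a two-way softmax at temperature tau is the Gumbel-Softmax
    relaxation; a two-way softmax equals a sigmoid of the difference.
    """
    probs = []
    for logit in keep_logits:
        z = (logit + sample_gumbel(rng) - sample_gumbel(rng)) / tau
        probs.append(_sigmoid(z))
    return probs


def prune_points(points, keep_logits, seed=0, tau=1.0):
    """Return (kept_points, mask) after a hard keep/drop decision.

    The hard decision (prob > 0.5) is the Gumbel-max choice. In a
    differentiable framework one would typically pair it with a
    straight-through estimator so gradients flow through the soft
    probabilities; that part is omitted here.
    """
    rng = random.Random(seed)
    probs = gumbel_keep_probs(keep_logits, rng, tau)
    mask = [p > 0.5 for p in probs]
    kept = [pt for pt, keep in zip(points, mask) if keep]
    return kept, mask


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (5.0, -1.0, 0.2)]
    logits = [2.0, -2.0, 0.0]  # hypothetical scores
    kept, mask = prune_points(pts, logits, seed=42)
    print(mask, len(kept))
```

Because the pruning step only consumes points and per-point scores and returns a subset of the points, it can sit in front of an existing backbone without changing that backbone, which matches the "decoupled" property the abstract claims for GSP.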

Problem

Research questions and friction points this paper is trying to address.

Improves 3D object detection in sparse outdoor point clouds
Reduces computational cost in multi-sweep LiDAR processing
Enhances perception accuracy with Gumbel Spatial Pruning
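
To make the cost side of multi-sweep processing concrete, here is a small sketch of sweep accumulation. It assumes a common convention that the page itself does not spell out: each past sweep is moved into the current sensor frame with a rigid transform (simplified here to a yaw rotation plus translation), and a time-offset channel `dt` is appended to every point. The dictionary layout and the function names are hypothetical.

```python
import math


def transform_point(point, yaw, translation):
    """Rotate a point about the z-axis by `yaw`, then translate it."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


def accumulate_sweeps(sweeps):
    """Merge several sweeps into the coordinate frame of the current one.

    Each sweep is a dict (hypothetical layout) with keys:
      "points":      list of (x, y, z) tuples in that sweep's own frame
      "yaw":         rotation from that sweep's frame to the current frame
      "translation": (tx, ty, tz) from that sweep's frame to the current frame
      "dt":          time offset in seconds relative to the current sweep
    Returns a list of (x, y, z, dt) tuples.
    """
    merged = []
    for sweep in sweeps:
        for p in sweep["points"]:
            x, y, z = transform_point(p, sweep["yaw"], sweep["translation"])
            merged.append((x, y, z, sweep["dt"]))
    return merged


if __name__ == "__main__":
    current = {"points": [(1.0, 0.0, 0.0)], "yaw": 0.0,
               "translation": (0.0, 0.0, 0.0), "dt": 0.0}
    past = {"points": [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0)], "yaw": 0.0,
            "translation": (1.0, 0.0, 0.0), "dt": -0.05}
    cloud = accumulate_sweeps([current, past])
    print(len(cloud), cloud)
```

The merged list grows with the total number of points across all sweeps, so under the assumption of similar per-sweep point counts, moving from 10 to 40 sweeps roughly quadruples the input size. That growth is the cost a pruning step would need to offset.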

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gumbel Spatial Pruning layer
Dynamic point pruning
Increased LiDAR sweeps
Jianhao Li
Department of Computer Science and Engineering, Beihang University, China
Tianyu Sun
Indiana University
Partial differential equations, Scientific computing
Xueqian Zhang
Department of Electronic Engineering, Tsinghua University, China
Zhongdao Wang
Noah's Ark Lab, Huawei
Computer vision, Autonomous driving
Bailan Feng
Noah’s Ark Lab, Beijing, China
Hengshuang Zhao
The University of Hong Kong
Computer Vision, Machine Learning, Artificial Intelligence