Unlocking Generalization Power in LiDAR Point Cloud Registration

📅 2025-03-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LiDAR point cloud registration methods exhibit insufficient generalization across varying distances and disparate datasets, compromising safety-critical applications such as autonomous driving. To address this, we propose a lightweight, generalization-oriented Transformer architecture: (i) we eliminate cross-frame attention to enhance out-of-distribution robustness; (ii) we introduce a progressive self-attention module to mitigate structural ambiguities in large-scale scenes; and (iii) we fuse bird’s-eye-view (BEV) semantic features to improve geometric–semantic consistency. Trained in a fully unsupervised manner, our method achieves state-of-the-art generalization performance: 94.5% and 91.4% registration recall on KITTI and nuScenes for cross-distance evaluation, respectively, and 90.9% recall for cross-dataset transfer from nuScenes to KITTI—surpassing all prior methods.

📝 Abstract
In real-world environments, a LiDAR point cloud registration method with robust generalization capabilities (across varying distances and datasets) is crucial for ensuring safety in autonomous driving and other LiDAR-based applications. However, current methods fall short in achieving this level of generalization. To address these limitations, we propose UGP, a pruned framework designed to enhance generalization power for LiDAR point cloud registration. The core insight in UGP is the elimination of cross-attention mechanisms to improve generalization, allowing the network to concentrate on intra-frame feature extraction. Additionally, we introduce a progressive self-attention module to reduce ambiguity in large-scale scenes and integrate Bird's Eye View (BEV) features to incorporate semantic information about scene elements. Together, these enhancements significantly boost the network's generalization performance. We validated our approach through various generalization experiments in multiple outdoor scenes. In cross-distance generalization experiments on KITTI and nuScenes, UGP achieved state-of-the-art mean Registration Recall rates of 94.5% and 91.4%, respectively. In cross-dataset generalization from nuScenes to KITTI, UGP achieved a state-of-the-art mean Registration Recall of 90.9%. Code will be available at https://github.com/peakpang/UGP.
Problem

Research questions and friction points this paper is trying to address.

Enhancing generalization in LiDAR point cloud registration.
Addressing limitations in cross-distance and cross-dataset performance.
Improving safety in autonomous driving via robust registration methods.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Eliminates cross-frame attention so the network focuses on intra-frame feature extraction, improving generalization.
Introduces a progressive self-attention module to reduce ambiguity in large-scale scenes.
Integrates Bird's Eye View (BEV) features to add semantic information about scene elements.
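The first two ideas — attention computed only within a single frame, applied over progressively growing neighborhoods — can be sketched in plain NumPy. This is a hypothetical illustration of the concept, not the authors' implementation: the function names, the radius schedule, and the distance-based masking scheme are assumptions made for clarity.

```python
import numpy as np

def intra_frame_self_attention(feats, attend_mask=None):
    """Scaled dot-product self-attention over one frame's point features.

    feats: (N, d) features from a single LiDAR frame only -- there is
    deliberately no cross-frame (source/target) attention here.
    attend_mask: optional (N, N) boolean mask; True = may attend.
    """
    d = feats.shape[1]
    scores = feats @ feats.T / np.sqrt(d)
    if attend_mask is not None:
        scores = np.where(attend_mask, scores, -np.inf)
    # Row-wise softmax (each point always attends at least to itself).
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ feats

def progressive_self_attention(points, feats, radii):
    """Apply self-attention repeatedly with a growing spatial neighborhood.

    Restricting early layers to small neighborhoods and widening them
    step by step is one plausible reading of "progressive" attention;
    the concrete radius schedule here is an assumption.
    """
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    for r in radii:
        feats = intra_frame_self_attention(feats, attend_mask=dist <= r)
    return feats
```

In this reading, each frame is encoded independently, so out-of-distribution shifts in the other frame cannot corrupt its features; correspondences would then be found by matching the two frames' descriptors afterwards.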
Zhenxuan Zeng
School of Computer Science, Northwestern Polytechnical University, China; Ningbo Institute, Northwestern Polytechnical University, China.
Qiao Wu
School of Computer Science, Northwestern Polytechnical University, China; Ningbo Institute, Northwestern Polytechnical University, China.
Xiyu Zhang
Northwestern Polytechnical University
3D computer vision, point cloud processing, graph neural networks
Lin Yuanbo Wu
Swansea University
Computer Vision, AI Generation, Trustworthy AI, Autonomous Systems, Embodied Visual Intelligence
Pei An
Huazhong University of Science and Technology, China.
Jiaqi Yang
School of Computer Science, Northwestern Polytechnical University, China; Ningbo Institute, Northwestern Polytechnical University, China.
Ji Wang
School of Computer Science, Northwestern Polytechnical University, China; Ningbo Institute, Northwestern Polytechnical University, China.
Peng Wang
School of Computer Science, Northwestern Polytechnical University, China; Ningbo Institute, Northwestern Polytechnical University, China.