🤖 AI Summary
To address the reliance on large-scale annotated training data and the poor generalization of existing point cloud registration methods, this paper proposes the first zero-shot registration refinement framework built on pre-trained diffusion models. Without fine-tuning or additional training, the method extracts geometric structure via multi-view depth-map projection and leverages the frozen encoder of a pre-trained diffusion model (e.g., Stable Diffusion) to obtain semantically rich depth-diffusion features. These geometric and diffusion features are then fused to refine initial correspondences. Evaluated on standard benchmarks, including ModelNet40 and 3DMatch, the approach consistently improves the accuracy of mainstream registration methods such as ICP and RPM-Net, reducing average rotation error by 18.7%. Notably, it demonstrates strong cross-dataset generalization without domain-specific adaptation. The source code is publicly available.
📝 Abstract
Recent research leveraging large-scale pretrained diffusion models has demonstrated the potential of using diffusion features to establish semantic correspondences in images. Inspired by these advances, we propose a novel zero-shot method for refining point cloud registration algorithms. Our approach leverages correspondences derived from depth images to enhance point feature representations, eliminating the need for a dedicated training dataset. Specifically, we first project the point cloud into depth maps from multiple perspectives and extract implicit knowledge from a pretrained diffusion network as depth diffusion features. These features are then integrated with geometric features obtained from existing methods to establish more accurate correspondences between point clouds. By leveraging these refined correspondences, our approach achieves significantly improved registration accuracy. Extensive experiments demonstrate that our method not only enhances the performance of existing point cloud registration techniques but also exhibits robust generalization across diverse datasets. Code is available at https://github.com/zhengcy-lambo/RARE.git.
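The pipeline the abstract describes (multi-view depth projection, frozen diffusion features, feature fusion, correspondence refinement) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `project_depth_maps`, `fuse_and_match`, the orthographic camera model, and the fusion weight `alpha` are all hypothetical, and the frozen diffusion encoder is omitted, with its per-point output treated as an arbitrary feature array.

```python
import numpy as np


def project_depth_maps(points, n_views=4, res=64):
    """Render a point cloud (N, 3) into depth maps from several viewpoints.

    Assumption: simple orthographic projection after rotating the cloud
    about the y-axis; the paper's actual rendering may differ.
    """
    maps = []
    for k in range(n_views):
        theta = 2 * np.pi * k / n_views
        rot = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                        [0.0, 1.0, 0.0],
                        [-np.sin(theta), 0.0, np.cos(theta)]])
        p = points @ rot.T
        # Normalize x, y to the pixel grid and z-buffer the depths.
        xy = p[:, :2]
        xy = (xy - xy.min(0)) / (np.ptp(xy, 0) + 1e-8)
        ij = np.clip((xy * (res - 1)).astype(int), 0, res - 1)
        depth = np.full((res, res), np.inf)
        for (i, j), z in zip(ij, p[:, 2]):
            depth[j, i] = min(depth[j, i], z)
        maps.append(depth)
    return maps


def fuse_and_match(geo_src, dif_src, geo_tgt, dif_tgt, alpha=0.5):
    """Fuse geometric and (diffusion-derived) features by weighted
    concatenation, then keep mutual-nearest-neighbour correspondences.

    Returns an (M, 2) array of (source index, target index) pairs.
    """
    f_src = np.hstack([alpha * geo_src, (1 - alpha) * dif_src])
    f_tgt = np.hstack([alpha * geo_tgt, (1 - alpha) * dif_tgt])
    # Pairwise Euclidean distances between fused descriptors.
    dist = np.linalg.norm(f_src[:, None] - f_tgt[None], axis=-1)
    nn_st = dist.argmin(1)          # source -> target nearest neighbour
    nn_ts = dist.argmin(0)          # target -> source nearest neighbour
    mutual = nn_ts[nn_st] == np.arange(len(f_src))
    return np.stack([np.nonzero(mutual)[0], nn_st[mutual]], axis=1)
```

In practice the `dif_*` arrays would come from back-projecting encoder activations of the rendered depth maps onto the points, and the resulting correspondences would seed a rigid solver such as ICP or RPM-Net.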