🤖 AI Summary
Latent diffusion models (LDMs) are hard to apply to point cloud completion because point clouds are unordered and irregular. To address this, the paper proposes DiffPCN, an LDM-based coarse-to-fine framework. The incomplete point cloud is first projected into multi-view depth maps; a DepthLDM, conditioned on these maps, synthesizes completed depth maps, from which an initial coarse point cloud is reconstructed. A Point Denoising Network then removes outliers by predicting a per-point distance score, and an Association-Aware Point Upsampler densifies the result using local association features between the input points and the coarse points. The core contribution is bringing LDMs to point cloud completion by generating in depth-map space and then refining the output with geometric constraints and local relational features. Experiments on multiple benchmarks show improvements in geometric accuracy, shape completeness, and visual quality, achieving state-of-the-art performance.
📝 Abstract
Latent diffusion models (LDMs) have demonstrated remarkable generative capabilities across various low-level vision tasks. However, their potential for point cloud completion remains underexplored due to the unstructured and irregular nature of point clouds. In this work, we propose DiffPCN, a novel diffusion-based coarse-to-fine framework for point cloud completion. Our approach comprises two stages: an initial stage that generates coarse point clouds, and a refinement stage that improves their quality through point denoising and upsampling. Specifically, we first project the unordered and irregular partial point cloud into structured depth images, which serve as conditions for a well-designed DepthLDM to synthesize completed multi-view depth images that are then used to form coarse point clouds. In this way, DiffPCN yields high-quality, high-completeness coarse point clouds by leveraging the LDM's powerful generation and comprehension capabilities. Then, since LDMs inevitably introduce outliers into the generated depth maps, we design a Point Denoising Network that removes artifacts from the coarse point cloud by predicting a per-point distance score. Finally, we devise an Association-Aware Point Upsampler, which guides the upsampling process by leveraging local association features between the input point cloud and the corresponding coarse points, yielding a dense and high-fidelity output. Experimental results demonstrate that DiffPCN achieves state-of-the-art performance in geometric accuracy and shape completeness, significantly improving the robustness and consistency of point cloud completion.
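The data flow around the generative model can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the three axis-aligned orthographic views, the `[-1, 1]` normalization, the function names (`project`, `back_project`, `filter_by_score`), and the simple score threshold are all assumptions made for illustration. The abstract does not specify the camera setup or how the predicted distance score is used, and the DepthLDM itself, which would sit between `project` and `back_project`, is omitted.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
DepthMap = List[List[Optional[float]]]  # None marks an empty pixel

# Hypothetical camera setup: three orthographic views, one along each axis.
# Each view is (u_axis, v_axis, depth_axis) as indices into (x, y, z).
VIEWS = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]


def _to_pixel(coord: float, resolution: int) -> int:
    """Map a coordinate in [-1, 1] to a pixel index in [0, resolution - 1]."""
    idx = int((coord + 1.0) / 2.0 * resolution)
    return min(max(idx, 0), resolution - 1)


def _to_coord(idx: int, resolution: int) -> float:
    """Map a pixel index back to the coordinate of the pixel centre."""
    return (idx + 0.5) / resolution * 2.0 - 1.0


def project(points: Sequence[Point], view: Tuple[int, int, int],
            resolution: int = 32) -> DepthMap:
    """Render an orthographic depth map, keeping the nearest point per pixel."""
    u_ax, v_ax, d_ax = view
    depth: DepthMap = [[None] * resolution for _ in range(resolution)]
    for p in points:
        row = _to_pixel(p[v_ax], resolution)
        col = _to_pixel(p[u_ax], resolution)
        d = p[d_ax] + 1.0  # distance from the image plane placed at -1
        if depth[row][col] is None or d < depth[row][col]:
            depth[row][col] = d
    return depth


def back_project(depth: DepthMap, view: Tuple[int, int, int]) -> List[Point]:
    """Lift every non-empty pixel of a depth map back to a 3D point."""
    u_ax, v_ax, d_ax = view
    resolution = len(depth)
    points: List[Point] = []
    for row, line in enumerate(depth):
        for col, d in enumerate(line):
            if d is None:
                continue
            xyz = [0.0, 0.0, 0.0]
            xyz[u_ax] = _to_coord(col, resolution)
            xyz[v_ax] = _to_coord(row, resolution)
            xyz[d_ax] = d - 1.0
            points.append((xyz[0], xyz[1], xyz[2]))
    return points


def filter_by_score(points: Sequence[Point], scores: Sequence[float],
                    threshold: float) -> List[Point]:
    """Assumed use of a per-point distance score: drop points scoring above
    the threshold (lower score is taken to mean closer to the surface)."""
    return [p for p, s in zip(points, scores) if s <= threshold]


if __name__ == "__main__":
    partial = [(0.0, 0.0, 0.5), (0.0, 0.0, -0.5), (0.5, -0.5, 0.0)]
    depth_maps = [project(partial, v, resolution=8) for v in VIEWS]
    # A generative model would complete each depth map here.
    coarse = [p for m, v in zip(depth_maps, VIEWS) for p in back_project(m, v)]
    print(len(coarse), "coarse points recovered from", len(VIEWS), "views")
```

In this sketch each depth map keeps only the nearest surface along its viewing axis, which is why several views are needed before back-projection yields a usable coarse cloud; a learned completion model would fill in the empty pixels before that step.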