🤖 AI Summary
Existing mask-based point cloud in-context learning (ICL) methods lack geometric priors and suffer from a mismatch between training and inference objectives, hindering effective modeling of spatial structure. To address these issues, this work proposes DeformPIC, the first framework to incorporate deformable mechanisms into point cloud ICL. Instead of relying on masked reconstruction, DeformPIC leverages task-specific prompts to guide explicit deformation of the query point cloud, aligning training and inference objectives while avoiding dependence on target-side information that is unavailable at inference time. By integrating deformable modeling, prompt-driven task conditioning, and a Transformer architecture, the method achieves consistent improvements across multiple tasks: it reduces average Chamfer Distance by 1.6, 1.8, and 4.7 points on reconstruction, denoising, and registration, respectively, and establishes state-of-the-art performance on a newly introduced out-of-domain generalization benchmark, significantly enhancing geometric understanding and generalization.
📝 Abstract
Recent advances in point cloud In-Context Learning (ICL) have demonstrated strong multitask capabilities. Existing approaches typically adopt a Masked Point Modeling (MPM)-based paradigm for point cloud ICL. However, MPM-based methods directly predict the target point cloud from masked tokens without leveraging geometric priors, requiring the model to infer spatial structure and geometric details solely from token-level correlations via transformers. Additionally, these methods suffer from a training-inference objective mismatch, as the model learns to predict the target point cloud using target-side information that is unavailable at inference time. To address these challenges, we propose DeformPIC, a deformation-based framework for point cloud ICL. Unlike existing approaches that rely on masked reconstruction, DeformPIC learns to deform the query point cloud under task-specific guidance from prompts, enabling explicit geometric reasoning and objectives that remain consistent between training and inference. Extensive experiments demonstrate that DeformPIC consistently outperforms previous state-of-the-art methods, achieving reductions of 1.6, 1.8, and 4.7 points in average Chamfer Distance on reconstruction, denoising, and registration tasks, respectively. Furthermore, we introduce a new out-of-domain benchmark to evaluate generalization across unseen data distributions, where DeformPIC achieves state-of-the-art performance.