ComPose: A Unified Completion-Pose Framework for Robust Category-Level Object Pose Estimation

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses insufficient shape awareness in category-level object pose estimation, which arises because observed point clouds are incomplete. The authors propose a unified end-to-end framework that tightly integrates point cloud completion with pose estimation. A keypoint-based progressive completion module recovers the full shape by predicting a sparse set of keypoints and the dense point sets around them, and a geometric relation encoding module enriches the keypoint features with local and global geometric context. A geometric relation consistency loss enforces structural alignment between observed keypoints and their predicted NOCS coordinates. The method does not rely on category-specific shape priors, and the authors report that it outperforms state-of-the-art methods on standard benchmarks.

📝 Abstract

Category-level object pose estimation aims to predict the pose and size of arbitrary objects in specific categories. Existing methods struggle with the inherent incompleteness of observed point clouds, which limits their ability to capture complete object shapes for robust pose reasoning. While point cloud completion offers a promising solution, naively treating it as a separate preprocessing step for partial observations introduces compounding errors and additional computational overhead, ultimately hindering both accuracy and efficiency. To address these challenges, we propose ComPose, a novel unified framework that tightly integrates shape completion to provide complete geometric cues for enhanced pose estimation. At the core of ComPose is a keypoint-based progressive completion module, which recovers full shape representations by progressively predicting a sparse set of keypoints and their surrounding dense point sets, empowering the keypoints to capture holistic object geometries. A geometric relation encoding module further enriches keypoint features with both local and global geometric context. In addition, we introduce a novel geometric relation consistency loss to enforce structural alignment between observed keypoints and their predicted NOCS coordinates, ensuring globally coherent coordinate transformations. Extensive experiments on standard benchmarks demonstrate that our method outperforms state-of-the-art approaches without relying on category-level shape priors.
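The abstract does not spell out the form of the geometric relation consistency loss. One plausible reading, offered here as a minimal sketch and not as the authors' formulation, is that the pairwise distances among observed keypoints should match the pairwise distances among their predicted NOCS coordinates up to a single global scale, since rotation and translation preserve distances. The function names, the mean-distance normalization, and the L1 penalty below are all assumptions made for illustration.

```python
import numpy as np


def pairwise_distances(points):
    """Return the (K, K) matrix of Euclidean distances for K points of shape (K, 3)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def relation_consistency_loss(obs_kpts, nocs_kpts, eps=1e-8):
    """Hypothetical consistency loss between observed and NOCS keypoints.

    Both inputs have shape (K, 3) and are assumed to be in row-wise
    correspondence. Each distance matrix is divided by its own mean so the
    comparison is invariant to the unknown object scale (an assumption of
    this sketch, not a detail taken from the paper).
    """
    d_obs = pairwise_distances(obs_kpts)
    d_nocs = pairwise_distances(nocs_kpts)
    d_obs = d_obs / (d_obs.mean() + eps)
    d_nocs = d_nocs / (d_nocs.mean() + eps)
    # Mean absolute difference between the two normalized relation matrices.
    return float(np.abs(d_obs - d_nocs).mean())


# Usage: NOCS keypoints mapped to camera space by a similarity transform.
rng = np.random.default_rng(0)
nocs = rng.uniform(-0.5, 0.5, size=(16, 3))
theta = 0.7
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
observed = 2.5 * nocs @ rotation.T + np.array([0.1, -0.2, 0.9])

consistent = relation_consistency_loss(observed, nocs)
perturbed = relation_consistency_loss(
    observed, nocs + rng.normal(scale=0.1, size=nocs.shape)
)
```

Under these assumptions, `consistent` is close to zero because the two keypoint sets differ only by a similarity transform, while `perturbed` is larger because the noisy NOCS predictions no longer preserve the keypoint layout. A training version would use a differentiable tensor library in place of NumPy.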

Problem

Research questions and friction points this paper is trying to address.

category-level object pose estimation

point cloud completion

incomplete observations

shape representation

geometric reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

unified completion-pose framework

keypoint-based progressive completion

geometric relation encoding