🤖 AI Summary
In category-level 6D pose estimation from monocular RGB images, the canonical NOCS coordinate map carries intra-class shape variation: instances of the same category with different shapes produce different NOCS maps for the same category-level pose, which limits the performance of geometry-guided regression. To address this, we propose GIVEPose, a framework with three key contributions: (1) Intra-class Variation-Free Consensus (IVFC) coordinates, a representation generated from a category-level consensus model that decouples canonical category geometry from instance-specific shape variation; (2) a dual-coordinate mapping mechanism that uses the NOCS and IVFC maps jointly to suppress shape variation progressively; and (3) a feature-decoupling and pose-optimization network that strengthens geometric guidance and category-level generalization. Experiments on both synthetic and real-world benchmarks show significant improvements in pose accuracy over state-of-the-art RGB-only methods. The code is publicly available.
📝 Abstract
Recent advances in RGB-D-based category-level object pose estimation have been limited by their reliance on precise depth information, which restricts their broader applicability. In response, RGB-based methods have been developed. Among these, geometry-guided pose regression, which originated in instance-level tasks, has demonstrated strong performance. However, we argue that the NOCS map is an inadequate intermediate representation for geometry-guided pose regression, as its many-to-one correspondence with category-level pose introduces redundant instance-specific information, leading to suboptimal results. This paper identifies the intra-class variation problem inherent in pose regression based solely on the NOCS map and proposes the Intra-class Variation-Free Consensus (IVFC) map, a novel coordinate representation generated from the category-level consensus model. By leveraging the complementary strengths of the NOCS map and the IVFC map, we introduce GIVEPose, a framework that implements Gradual Intra-class Variation Elimination for category-level object pose estimation. Extensive evaluations on both synthetic and real-world datasets demonstrate that GIVEPose significantly outperforms existing state-of-the-art RGB-based approaches in category-level object pose estimation. Our code is available at https://github.com/ziqin-h/GIVEPose.
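The intra-class variation problem can be illustrated with a toy example. The sketch below is not taken from the GIVEPose code. It assumes the common NOCS convention of centering an object's tight bounding box at 0.5 and scaling its diagonal to 1, and it uses two hypothetical box-shaped "instances" as stand-ins for objects of one category. The helper names `nocs_coordinates` and `box_corners` are illustrative only.

```python
import numpy as np


def nocs_coordinates(points: np.ndarray) -> np.ndarray:
    """Map object-frame points of shape (N, 3) into the unit cube.

    Assumed convention: the tight bounding box is centred at 0.5 and
    scaled so that its diagonal has length 1.
    """
    lo, hi = points.min(axis=0), points.max(axis=0)
    center = (lo + hi) / 2.0
    diagonal = np.linalg.norm(hi - lo)
    return (points - center) / diagonal + 0.5


def box_corners(width: float, height: float, depth: float) -> np.ndarray:
    """Eight corners of an axis-aligned box centred at the origin."""
    half = np.array([width, height, depth]) / 2.0
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]
    )
    return signs * half


# Two hypothetical instances of one category, in the same canonical pose.
tall = nocs_coordinates(box_corners(1.0, 3.0, 1.0))
wide = nocs_coordinates(box_corners(2.0, 1.0, 2.0))

# Corresponding corners land at different NOCS coordinates, so the two
# instances would render to different NOCS maps for one and the same pose.
gap = np.abs(tall - wide).max()
print(f"largest per-axis NOCS difference between instances: {gap:.3f}")
```

Uniform scaling alone does not change the result, because the normalization removes it. The difference comes from shape (here, aspect ratio), which is the instance-specific information the abstract describes as redundant for pose regression.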