Co-op: Correspondence-based Novel Object Pose Estimation

📅 2025-03-22

🤖 AI Summary

This paper addresses single-image, RGB-based 6DoF pose estimation for objects unseen during training: given only the CAD model of the target object, the method estimates its pose without any additional fine-tuning. It finds semi-dense correspondences between the input image and a small number of pre-rendered templates, using a hybrid representation that combines patch-level classification with offset regression. A pose refinement model then estimates probabilistic flow between the input image and a rendered image and refines the initial estimate through a differentiable PnP layer. The method reports state-of-the-art accuracy on the seven core datasets of the BOP Challenge, outperforming existing methods by a large margin while remaining fast.

📝 Abstract

We propose Co-op, a novel method for accurately and robustly estimating the 6DoF pose of objects unseen during training from a single RGB image. Our method requires only the CAD model of the target object and can precisely estimate its pose without any additional fine-tuning. While existing model-based methods suffer from inefficiency due to using a large number of templates, our method enables fast and accurate estimation with a small number of templates. This improvement is achieved by finding semi-dense correspondences between the input image and the pre-rendered templates. Our method achieves strong generalization performance by leveraging a hybrid representation that combines patch-level classification and offset regression. Additionally, our pose refinement model estimates probabilistic flow between the input image and the rendered image, refining the initial estimate to an accurate pose using a differentiable PnP layer. We demonstrate that our method not only estimates object poses rapidly but also outperforms existing methods by a large margin on the seven core datasets of the BOP Challenge, achieving state-of-the-art accuracy.
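The hybrid representation mentioned above can be illustrated with a small sketch. The abstract does not give the exact parameterization, so everything here is an assumption made for illustration: the template is split into a regular grid of square patches, the classifier picks one patch per query location, and the regressed offset is expressed in normalized units in [-1, 1] relative to the patch center. The names `best_patch` and `decode_correspondence` are hypothetical.

```python
def best_patch(scores):
    """Return the index of the highest-scoring template patch (argmax)."""
    return max(range(len(scores)), key=scores.__getitem__)


def decode_correspondence(patch_index, offset, grid_width, patch_size):
    """Turn a (patch class, in-patch offset) pair into template pixel coordinates.

    Assumptions (not taken from the paper): patches are indexed row-major on a
    grid that is `grid_width` patches wide, each patch is `patch_size` pixels
    square, and `offset` is (dx, dy) in [-1, 1] relative to the patch center.
    """
    row, col = divmod(patch_index, grid_width)
    # Coarse location: the center of the classified patch.
    center_x = (col + 0.5) * patch_size
    center_y = (row + 0.5) * patch_size
    # Fine location: the regressed offset, scaled to half a patch.
    dx, dy = offset
    return (center_x + dx * patch_size / 2, center_y + dy * patch_size / 2)


# A 4x4 grid of 16-pixel patches; patch 5 (row 1, column 1) scores highest.
scores = [0.0] * 16
scores[5] = 0.9
match = decode_correspondence(best_patch(scores), (0.5, -0.25), grid_width=4, patch_size=16)
```

Under these assumptions, classification narrows the match to one patch and regression recovers sub-patch precision, which is one plausible reading of why the combination helps with unseen objects.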

Problem

Research questions and friction points this paper is trying to address.

Estimating the 6DoF pose of unseen objects from a single RGB image

Reducing the inefficiency caused by large template sets in model-based pose estimation

Achieving state-of-the-art accuracy without fine-tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses semi-dense correspondences for pose estimation

Leverages hybrid patch-level classification and offset regression

Refines pose with probabilistic flow and differentiable PnP
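To make the refinement idea concrete, the sketch below computes a confidence-weighted reprojection error, the kind of objective a PnP solver typically minimizes over the pose. It is not the paper's differentiable PnP layer: the pinhole camera model, the per-correspondence scalar weights standing in for flow confidence, and the function names `project` and `weighted_reprojection_error` are all assumptions made for illustration.

```python
def project(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D model point into the image with a pinhole camera."""
    # Camera-frame coordinates: X_c = R @ X + t
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def weighted_reprojection_error(points_3d, points_2d, weights,
                                rotation, translation, intrinsics):
    """Sum of squared 2D residuals, each scaled by its confidence weight."""
    fx, fy, cx, cy = intrinsics
    total = 0.0
    for p3, (u, v), w in zip(points_3d, points_2d, weights):
        pu, pv = project(p3, rotation, translation, fx, fy, cx, cy)
        total += w * ((pu - u) ** 2 + (pv - v) ** 2)
    return total


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
intrinsics = (100.0, 100.0, 50.0, 50.0)  # fx, fy, cx, cy (hypothetical values)
model_points = [(0, 0, 0), (1, 0, 0)]
observed = [(50.0, 50.0), (103.0, 54.0)]  # second match is off by (3, 4) pixels
error = weighted_reprojection_error(
    model_points, observed, [1.0, 0.5], identity, (0, 0, 2), intrinsics
)
```

Down-weighting uncertain matches lets the pose be driven by reliable correspondences; a differentiable solver would additionally propagate gradients of the resulting pose back to the predicted flow and weights.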
