🤖 AI Summary
This work addresses the challenging problem of estimating the 6D pose of an unseen object from a single RGB-D reference image without pose annotations, eliminating reliance on CAD models or multi-view inputs. We propose an SE(3)-invariant reference coordinate system and a learnable correspondence confidence reweighting mechanism. Within a coarse-to-fine framework, our method jointly leverages geometry-invariant feature extraction and deep RGB-D fusion to robustly handle large pose discrepancies, low viewpoint overlap, and sensor noise. Evaluated on the BOP benchmark, our approach significantly outperforms existing single-reference methods, both traditional and learning-based, while approaching the performance of CAD-dependent approaches. To foster reproducibility and further research, we release both the source code and the dataset.
📝 Abstract
Unseen object pose estimation methods often rely on CAD models or multiple reference views, making the onboarding stage costly. To simplify reference acquisition, we aim to estimate the unseen object's pose from a single unposed RGB-D reference image. While previous works leverage reference images as pose anchors to limit the range of the relative pose, our scenario is significantly more challenging since the relative transformation can vary across the entire SE(3) space. Moreover, factors such as occlusion, sensor noise, and extreme geometry can result in low viewpoint overlap. To address these challenges, we present a novel approach and benchmark, termed UNOPose, for unseen one-reference-based object pose estimation. Building upon a coarse-to-fine paradigm, UNOPose constructs an SE(3)-invariant reference frame to standardize object representation despite pose and size variations. To alleviate low overlap across viewpoints, we recalibrate the weight of each correspondence based on its predicted likelihood of lying within the overlapping region. Evaluated on our proposed benchmark based on the BOP Challenge, UNOPose demonstrates superior performance, significantly outperforming traditional and learning-based methods in the one-reference setting while remaining competitive with CAD-model-based methods. The code and dataset are available at https://github.com/shanice-l/UNOPose.
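To give a concrete sense of the correspondence reweighting idea, the sketch below shows a generic confidence-weighted Procrustes (Kabsch) solve, where each 3D-3D correspondence is weighted by a predicted overlap confidence before estimating the rigid transform. This is an illustrative sketch only, not UNOPose's exact implementation; the function name and the sigmoid weighting are assumptions.

```python
import numpy as np

def weighted_pose_from_correspondences(src, dst, overlap_logits):
    """Estimate R, t with R @ src + t ~= dst, weighting each
    correspondence by a predicted overlap confidence (illustrative)."""
    # Map per-correspondence logits to (0, 1) confidences, then normalize.
    w = 1.0 / (1.0 + np.exp(-overlap_logits))
    w = w / w.sum()

    # Weighted centroids of both point sets.
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_d = (w[:, None] * dst).sum(axis=0)

    # Weighted cross-covariance between centered point sets.
    C = (dst - mu_d).T @ (w[:, None] * (src - mu_s))

    # SVD-based rotation estimate, with a reflection correction so det(R) = +1.
    U, _, Vt = np.linalg.svd(C)
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    t = mu_d - R @ mu_s
    return R, t
```

Correspondences predicted to fall outside the overlapping region receive low weights and thus contribute little to the pose estimate, which is the intuition behind the reweighting described above.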