Any6D: Model-free 6D Pose Estimation of Novel Objects

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the problem of estimating the 6D pose and metric size of an unknown object from a single RGB-D anchor image, without relying on a textured 3D model or multiple viewpoints. The proposed framework, Any6D, is model-free: it uses a joint object alignment process to improve 2D-3D alignment and metric scale estimation, and a render-and-compare strategy to generate and refine pose hypotheses. The design targets occlusions, non-overlapping views, varied lighting conditions, and large cross-environment differences. Evaluated on five benchmarks (REAL275, Toyota-Light, HO3D, YCBInEOAT, and LM-O), the method is reported to significantly outperform state-of-the-art approaches for novel object pose estimation.

📝 Abstract
We introduce Any6D, a model-free framework for 6D object pose estimation that requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. Unlike existing methods that rely on textured 3D models or multiple viewpoints, Any6D leverages a joint object alignment process to enhance 2D-3D alignment and metric scale estimation for improved pose accuracy. Our approach integrates a render-and-compare strategy to generate and refine pose hypotheses, enabling robust performance in scenarios with occlusions, non-overlapping views, diverse lighting conditions, and large cross-environment variations. We evaluate our method on five challenging datasets (REAL275, Toyota-Light, HO3D, YCBInEOAT, and LM-O), where it significantly outperforms state-of-the-art methods for novel object pose estimation. Project page: https://taeyeop.com/any6d

Problem

Research questions and friction points this paper is trying to address.

Estimates the 6D pose and size of unknown objects from a single RGB-D image
Improves 2D-3D alignment and metric scale estimation without textured 3D models
Handles occlusions, lighting variations, and cross-environment changes
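
To make "6D pose and size" concrete: a 6D pose is a rigid transform with three rotational and three translational degrees of freedom, and the size adds a metric extent per axis. The sketch below is only an illustration of that parameterization, not code from the paper; the function names `apply_pose_and_size` and `rotation_z`, and the convention that canonical points live in a unit cube centred at the origin, are assumptions made for this example.

```python
import numpy as np

def rotation_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def apply_pose_and_size(points, rotation, translation, size):
    """Map canonical points into the camera frame (hypothetical helper).

    points:      (N, 3) points in a unit cube centred at the origin (assumed convention)
    rotation:    (3, 3) rotation matrix      -> 3 degrees of freedom
    translation: (3,)   translation, metres  -> 3 degrees of freedom
    size:        (3,)   metric extent of the object along each canonical axis
    """
    scaled = points * size                     # stretch the unit cube to metric size
    return scaled @ rotation.T + translation   # rotate, then translate

# One corner of the unit cube, for an object 0.1 x 0.2 x 0.3 m in size,
# rotated 90 degrees about z and placed 1 m in front of the camera.
corner = np.array([[0.5, 0.5, 0.5]])
corner_cam = apply_pose_and_size(
    corner, rotation_z(np.pi / 2), np.array([0.0, 0.0, 1.0]), np.array([0.1, 0.2, 0.3])
)
```

Keeping size separate from the rotation and translation means the pose stays a rigid transform, while the object's metric extent is carried by its own three parameters.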

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a single RGB-D anchor image for pose estimation
Joint object alignment improves 2D-3D alignment and metric scale estimation
Render-and-compare strategy generates and refines pose hypotheses
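
The general idea of render-and-compare can be sketched as follows: render the object under each candidate pose, compare the rendering with the observation, and keep the best-scoring hypothesis. This is a minimal sketch under stated assumptions, not the Any6D implementation. It swaps in a toy point-splat depth renderer for a real mesh renderer, uses a plain depth error as the comparison score, and only selects among hypotheses without refining them. The names `render_depth`, `depth_error`, and `select_best_pose` are hypothetical.

```python
import numpy as np

def render_depth(points_cam, fx, fy, cx, cy, height, width):
    """Toy renderer: splat camera-frame points into a z-buffered depth image (0 = empty)."""
    depth = np.full((height, width), np.inf)
    front = points_cam[points_cam[:, 2] > 0]           # keep points in front of the camera
    u = np.round(fx * front[:, 0] / front[:, 2] + cx).astype(int)
    v = np.round(fy * front[:, 1] / front[:, 2] + cy).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    # Keep the nearest depth per pixel (z-buffer).
    np.minimum.at(depth, (v[inside], u[inside]), front[inside, 2])
    depth[np.isinf(depth)] = 0.0
    return depth

def depth_error(rendered, observed):
    """Mean absolute depth difference over pixels covered by either image."""
    mask = (rendered > 0) | (observed > 0)
    if not mask.any():
        return np.inf
    return float(np.abs(rendered - observed)[mask].mean())

def select_best_pose(model_points, hypotheses, observed, camera):
    """Score every (rotation, translation) hypothesis and return the best index and all errors."""
    errors = []
    for rotation, translation in hypotheses:
        points_cam = model_points @ rotation.T + translation
        errors.append(depth_error(render_depth(points_cam, *camera), observed))
    return int(np.argmin(errors)), errors

# Synthetic usage: the observation is rendered from a known pose, so the
# matching hypothesis should reproduce it with the lowest error.
rng = np.random.default_rng(0)
model_points = rng.uniform(-0.1, 0.1, size=(500, 3))    # stand-in object geometry
camera = (100.0, 100.0, 32.0, 32.0, 64, 64)             # fx, fy, cx, cy, height, width
true_t = np.array([0.0, 0.0, 1.0])
observed = render_depth(model_points + true_t, *camera)

hypotheses = [
    (np.eye(3), np.array([0.1, 0.0, 1.0])),   # shifted sideways
    (np.eye(3), true_t),                      # matches the observation
    (np.eye(3), np.array([0.0, 0.0, 1.3])),   # too far away
]
best, errors = select_best_pose(model_points, hypotheses, observed, camera)
```

In a fuller pipeline the selected hypothesis would typically be refined further, and the comparison could use appearance as well as depth. Both are left out here to keep the sketch short.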