DLTPose: 6DoF Pose Estimation From Accurate Dense Surface Point Estimates

📅 2025-04-09
🤖 AI Summary
This work addresses 6DoF pose estimation for symmetric and heavily occluded objects in RGB-D images. It proposes a symmetry-aware, densely supervised pose regression framework with three parts: (1) it replaces fixed-order keypoints with a symmetry-aware keypoint ordering mechanism that explicitly models the multiple valid pose configurations of symmetric objects; (2) it regresses, per pixel, the radial distances to at least four keypoints, jointly leveraging RGB and depth features; and (3) it introduces a novel Direct Linear Transform (DLT) formulation that reconstructs accurate 3D object-frame surface points, and from them the object pose, out of the dense radial distance maps. Evaluated on the LINEMOD, Occlusion LINEMOD, and YCB-Video benchmarks, the method achieves Mean Average Recall values of 86.5%, 79.7%, and 89.5%, respectively, and outperforms existing approaches, especially under severe occlusion and symmetry ambiguities.
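
Step (3) goes from reconstructed surface points to a pose. As a rough illustration of how that last step can work, the sketch below assumes each pixel has both an estimated object-frame point and a camera-frame point back-projected from depth, and fits a rigid transform between the two sets with the standard Kabsch (SVD) method. The pairing with depth points and the function name `fit_rigid_transform` are assumptions made for illustration; the summary does not say which solver DLTPose uses.

```python
import numpy as np

def fit_rigid_transform(obj_pts, cam_pts):
    """Least-squares R, t such that cam_pts ~= obj_pts @ R.T + t (Kabsch).

    obj_pts, cam_pts: (N, 3) arrays of corresponding 3D points.
    Hypothetical helper, not taken from the DLTPose code base.
    """
    obj_pts = np.asarray(obj_pts, dtype=float)
    cam_pts = np.asarray(cam_pts, dtype=float)
    obj_c = obj_pts.mean(axis=0)
    cam_c = cam_pts.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    H = (obj_pts - obj_c).T @ (cam_pts - cam_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cam_c - R @ obj_c
    return R, t
```

In practice such a fit would usually be wrapped in an outlier-rejection loop (for example RANSAC), since dense per-pixel estimates near occlusion boundaries tend to be noisy.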

📝 Abstract
We propose DLTPose, a novel method for 6DoF object pose estimation from RGB-D images that combines the accuracy of sparse keypoint methods with the robustness of dense pixel-wise predictions. DLTPose predicts per-pixel radial distances to a set of at least four keypoints, which are then fed into our novel Direct Linear Transform (DLT) formulation to produce accurate 3D object-frame surface estimates, leading to better 6DoF pose estimation. Additionally, we introduce a novel symmetry-aware keypoint ordering approach, designed to handle object symmetries that otherwise cause inconsistencies in keypoint assignments. Previous keypoint-based methods relied on fixed keypoint orderings, which fail to account for the multiple valid configurations exhibited by symmetric objects. Our ordering approach exploits these configurations to enhance the model's ability to learn stable keypoint representations. Extensive experiments on the LINEMOD, Occlusion LINEMOD and YCB-Video benchmark datasets show that DLTPose outperforms existing methods, especially for symmetric and occluded objects, achieving Mean Average Recall values of 86.5% (LM), 79.7% (LM-O) and 89.5% (YCB-V). The code is available at https://anonymous.4open.science/r/DLTPose_/.
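
To see why at least four keypoints are needed and why the problem admits a linear solution, consider a surface point p at distance r_i from keypoint k_i. Expanding ||p - k_i||^2 = r_i^2 gives -2 k_i·p + s = r_i^2 - ||k_i||^2 with s = ||p||^2, which is linear in the four unknowns (p_x, p_y, p_z, s). The sketch below solves this system in the least-squares sense for every pixel at once. It is a generic linearized multilateration written under these assumptions, not necessarily the exact DLT formulation of the paper, and the function name is hypothetical.

```python
import numpy as np

def points_from_radial_distances(keypoints, radii):
    """Recover 3D points from their distances to known keypoints.

    keypoints: (K, 3) keypoint coordinates in the object frame, K >= 4
               and not all coplanar (otherwise the system is rank deficient).
    radii:     (N, K) distances from each of N points to each keypoint.
    Returns an (N, 3) array of estimated object-frame points.
    Hypothetical helper for illustration only.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    radii = np.asarray(radii, dtype=float)
    num_kp = keypoints.shape[0]
    if num_kp < 4:
        raise ValueError("need at least four keypoints")
    # Each row encodes  -2 k_i . p + s = r_i^2 - ||k_i||^2,  s = ||p||^2.
    A = np.hstack([-2.0 * keypoints, np.ones((num_kp, 1))])   # (K, 4)
    b = radii ** 2 - np.sum(keypoints ** 2, axis=1)           # (N, K)
    # One least-squares solve with N right-hand sides.
    x, *_ = np.linalg.lstsq(A, b.T, rcond=None)               # (4, N)
    return x[:3].T
```

With exactly four non-coplanar keypoints the system is square and has a unique solution; additional keypoints over-determine it and average out noise in the predicted distances.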
Problem

Research questions and friction points this paper is trying to address.

Estimates 6DoF object pose from RGB-D images accurately
Handles object symmetries via a novel keypoint ordering approach
Improves pose estimation for symmetric and occluded objects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines sparse keypoint and dense pixel-wise predictions
Uses a Direct Linear Transform (DLT) formulation to produce 3D object-frame surface estimates
Introduces symmetry-aware keypoint ordering approach
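
The symmetry issue named in the last item can be made concrete with a small sketch. If a symmetry rotation S maps the keypoint set onto itself, then the pose hypotheses T and T·S are equally valid, and they differ only by a permutation of the keypoint indices. The code below computes those permutations and scores a prediction against the best matching ordering. This is a generic min-over-orderings illustration, assuming keypoints placed symmetrically on the object; it is not the authors' ordering mechanism, and both function names are hypothetical.

```python
import numpy as np

def symmetry_permutations(keypoints, symmetry_rotations, tol=1e-6):
    """For each rotation S, find perm with S @ k_i == k_perm[i]."""
    keypoints = np.asarray(keypoints, dtype=float)
    perms = []
    for S in symmetry_rotations:
        moved = keypoints @ np.asarray(S, dtype=float).T
        # dist[i, j] = ||S k_i - k_j||
        dist = np.linalg.norm(moved[:, None, :] - keypoints[None, :, :], axis=2)
        perm = dist.argmin(axis=1)
        is_bijection = len(set(perm.tolist())) == len(perm)
        if dist.min(axis=1).max() > tol or not is_bijection:
            raise ValueError("keypoints are not invariant under this rotation")
        perms.append(perm)
    return perms

def symmetry_aware_error(pred_radii, gt_radii, perms):
    """Mean absolute error against the best of the valid keypoint orderings.

    pred_radii, gt_radii: (N, K) per-pixel radial distances.
    """
    pred_radii = np.asarray(pred_radii, dtype=float)
    gt_radii = np.asarray(gt_radii, dtype=float)
    return min(float(np.abs(pred_radii - gt_radii[:, p]).mean()) for p in perms)
```

Passing the identity rotation together with the object's symmetry rotations ensures the original ordering is always among the candidates.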
Akash Jadhav
Dept. of Electrical and Computer Engineering, Ingenuity Labs Research Institute, Queen’s University, Kingston, Ontario, Canada
Michael Greenspan
Professor of Electrical and Computer Engineering, Queen’s University
computer vision