🤖 AI Summary
To address the low accuracy of 6D pose estimation for textureless objects under complex illumination, this paper proposes GS2POSE, the first differentiable rendering framework for 6D pose estimation to incorporate 3D Gaussian splatting (3DGS). The method establishes a pose-differentiable Gaussian rendering pipeline, enabling end-to-end pose regression via Lie algebra parameterization. It jointly optimizes pose and illumination-dependent color parameters to improve lighting robustness, and introduces a bundle adjustment (BA)-style iterative optimization strategy to improve geometric consistency. Evaluated on T-LESS, LineMOD-Occlusion, and LineMOD, GS2POSE achieves absolute ADD(-S) improvements of 1.4%, 2.8%, and 2.5%, respectively, outperforming state-of-the-art methods. Key contributions: (1) pioneering differentiable modeling of 3D Gaussian splatting for 6D pose estimation; and (2) enabling co-optimization of pose and illumination parameters, balancing accuracy and generalizability.
📝 Abstract
Accurate 6D pose estimation of 3D objects is a fundamental task in computer vision. Current research typically predicts the 6D pose by establishing correspondences between 2D image features and 3D model features; however, these methods often struggle with textureless objects and varying illumination conditions. To overcome these limitations, we propose GS2POSE, a novel approach to 6D object pose estimation. GS2POSE formulates a pose regression algorithm inspired by the principles of Bundle Adjustment (BA). Leveraging Lie algebra, we extend 3D Gaussian splatting (3DGS) into a pose-differentiable rendering pipeline that iteratively optimizes the pose by comparing the input image to the rendered image. Additionally, GS2POSE updates the color parameters of the 3DGS model, enhancing its adaptability to changes in illumination. Compared to previous models, GS2POSE achieves accuracy improvements of 1.4%, 2.8%, and 2.5% on the T-LESS, LineMOD-Occlusion, and LineMOD datasets, respectively.
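The BA-style refinement loop described above can be sketched in miniature. The toy below is illustrative only, not the paper's implementation: it parameterizes the pose update as an se(3) twist via the exponential map (the Lie algebra component), but replaces the differentiable 3DGS renderer with a simple pinhole projection of model points, the photometric loss with a reprojection loss, and analytic gradients with central finite differences. All function names and constants are assumptions made for the sketch.

```python
import numpy as np

def hat(w):
    """so(3) hat operator: 3-vector -> skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def se3_exp(xi):
    """Exponential map from a twist xi = (rho, w) in R^6 to a 4x4 SE(3) matrix."""
    rho, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:  # small-angle limit
        R, V = np.eye(3) + W, np.eye(3)
    else:
        R = (np.eye(3) + np.sin(theta) / theta * W
             + (1 - np.cos(theta)) / theta**2 * W @ W)
        V = (np.eye(3) + (1 - np.cos(theta)) / theta**2 * W
             + (theta - np.sin(theta)) / theta**3 * W @ W)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ rho
    return T

def render(points, T):
    """Stand-in for the differentiable renderer: pinhole projection of model points."""
    P = (T[:3, :3] @ points.T).T + T[:3, 3]
    return P[:, :2] / P[:, 2:3]  # normalized image coordinates

# Synthetic "object" in front of the camera and an unknown ground-truth pose.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, (50, 3)) + np.array([0.0, 0.0, 5.0])
xi_true = np.array([0.10, -0.05, 0.08, 0.04, -0.03, 0.05])
target = render(points, se3_exp(xi_true))  # plays the role of the input image

def loss(T):
    """Discrepancy between the current rendering and the observation."""
    return np.mean((render(points, T) - target) ** 2)

# Iterative refinement: left-compose small se(3) updates onto the estimate.
T_est = np.eye(4)
initial_loss = loss(T_est)
lr, eps = 0.5, 1e-6
for _ in range(300):
    g = np.zeros(6)
    for i in range(6):  # finite-difference gradient wrt the twist at identity
        d = np.zeros(6)
        d[i] = eps
        g[i] = (loss(se3_exp(d) @ T_est) - loss(se3_exp(-d) @ T_est)) / (2 * eps)
    T_est = se3_exp(-lr * g) @ T_est
print(f"loss: {initial_loss:.2e} -> {loss(T_est):.2e}")
```

In GS2POSE the rendering step is the full 3D Gaussian splatting rasterizer and the comparison is against the input photograph, so the same loop additionally carries gradients through Gaussian color parameters to absorb illumination changes.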